About The Team
Rfam and RNAcentral are key resources for RNA biology, serving tens of thousands of users every year and widely cited in the scientific literature.
We are recruiting a Bioinformatics Data Engineer to develop and maintain both the Rfam and RNAcentral databases, which are funded by the BBSRC and Wellcome. The RNA Resources team is part of the Sequence Families group led by Alex Bateman. The role reports to the Project Leader for RNA Resources and works closely with an RNA bioinformatician, two full-stack software developers, and an Rfam biocurator.
Responsibilities
* Run, maintain, and optimise data pipelines to ensure efficient processing, storage, and retrieval for Rfam and RNAcentral.
* Analyse requirements and propose new data pipeline architectures that improve performance and scalability.
* Analyse existing data curation and data production pipelines and identify areas for improvement, optimisation, and scalability.
* Modernise and containerise Rfam curation pipelines, and implement human‑in‑the‑loop, AI‑assisted agentic curation.
* Develop and scale LLM pipelines used in RNAcentral for literature summarisation and curation.
* Develop scalable workflows for ncRNA annotation in genomes.
* Document data pipelines, processes, and workflows for internal reference and knowledge sharing.
* Participate in RNAcentral and Rfam data releases.
* Engage with the scientific community through presentations at major conferences and consortium meetings.
* Keep up to date with the latest developments in RNA science to ensure the resources provide valuable data and analysis.
Qualifications
* Master’s level or equivalent qualification in a computational, biological or related scientific discipline.
* Proficiency in Python and other relevant languages for bioinformatics tool development.
* Experience with relational databases (PostgreSQL, MySQL) and SQL: knowledge of database architecture, performance tuning, partitioning strategies, indexing techniques, and query optimisation.
* Track record of developing and maintaining production bioinformatics pipelines with workflow management systems such as Nextflow or Snakemake.
* Experience building applications with LLMs and other AI technologies.
* Familiarity with Docker or other containerisation technologies, such as Singularity.
* Comfortable using Git/GitHub, Unix, and Bash.
* Experience with AI‑assisted coding tools.
* Ability to apply best‑practice software development methodologies.
* Strong communication skills.
Preferred Qualifications
* Knowledge of RNA biology and practical experience with Rfam, Infernal, R‑scape, and tools for secondary structure prediction.
* Familiarity with gene annotation or genome feature representation.
* Experience with high‑performance computing environments such as Slurm.
* Experience planning and executing data migration projects, including downtime management, data consistency verification, and rollback strategies.
* Experience with AI workflow libraries such as LangChain and LangGraph.
* Experience with Kubernetes and cloud infrastructure platforms such as OpenStack.
* Experience with the Rust programming language.
Other Helpful Information
* Hybrid working: two days per week in the office in Hinxton, with flexibility for onsite work.
* Contract length: 3 years (grant‑based contract).
* Salary: Grade 5, starting at £3,303 per month after tax (excluding pension and insurance contributions).
* Benefits: monthly family, child, and non‑resident allowances; annual salary review; pension scheme; death benefit; long‑term care; accident‑at‑work and unemployment insurances; private medical insurance for you and your immediate family; 30 days annual leave plus public holidays; relocation package with installation grant if required; campus life facilities; family benefits including on‑site nursery and generous parental leave; benefits for non‑UK residents such as visa exemption and monthly non‑resident allowance.
Diversity & Inclusion
We believe diverse teams drive innovation and scientific excellence. We encourage applications from candidates of all genders, identities, nationalities, and backgrounds.
Closing Date
28/06/2026