Overview
Rfam and RNAcentral are key resources for RNA biology, serving tens of thousands of users yearly and cited widely in the literature. This role is funded by the BBSRC and Wellcome and is part of the Sequence Families group led by Alex Bateman. The Bioinformatics Data Engineer will develop and maintain both the Rfam and RNAcentral databases, reporting to the Project Leader for RNA Resources and collaborating with bioinformaticians, developers, and curators.
Responsibilities
* Run, maintain and optimise data pipelines, ensuring efficient data processing, storage and retrieval for the RNA resources.
* Analyse existing curation and production pipelines, identifying opportunities for improvement, optimisation and scalability.
* Modernise and containerise Rfam curation pipelines, implementing human‑in‑the‑loop AI‑assisted curation.
* Develop and scale large‑language‑model pipelines used in RNAcentral for literature summarisation and curation.
* Define scalable workflows for ncRNA annotation in genomes.
* Document pipelines, processes and workflows for internal reference and knowledge sharing.
* Participate in RNAcentral and Rfam data releases.
* Provide outreach to the scientific community through presentations at major conferences (e.g., RNA Society Annual Meeting, ISMB) and convene regular feedback sessions with consortium members.
* Stay up to date with developments in RNA science to ensure the resources continue to serve user needs.
Qualifications
* Master’s level or equivalent qualification in a computational, biological or related scientific discipline.
* Proficiency in Python and other relevant bioinformatics programming languages.
* Experience with relational databases (PostgreSQL, MySQL); understanding of database architecture, performance tuning, partitioning, indexing and query optimisation.
* Track record of developing and maintaining production bioinformatics pipelines using workflow management systems such as Nextflow or Snakemake.
* Experience building applications that use large‑language‑model and other AI technologies.
* Familiarity with containerisation (Docker, Singularity) and cloud infrastructure such as OpenStack.
* Comfortable using Git/GitHub, Unix shell and Bash.
* Experience with AI‑assisted coding tools.
* Strong communication skills and ability to apply best‑practice software development methodologies.
* Knowledge of RNA biology and practical experience with Rfam, Infernal, R‑scape or secondary‑structure prediction tools (optional but desirable).
* Familiarity with gene annotation or genome feature representation.
* Experience in high‑performance computing environments such as Slurm.
* Experience planning and executing data‑migration projects, including downtime management, data consistency verification and rollback strategies.
* Experience with AI workflow libraries such as LangChain or LangGraph.
* Experience with Kubernetes and advanced cloud platforms.
* Experience with the Rust programming language.
Benefits & Compensation
* Hybrid working: 2 days per week from the office in Hinxton (Monday and Tuesday), with flexibility to come on site more often.
* Contract length: 3 years (grant‑based).
* Salary: Grade 5, starting at £3,303 per month after tax, excluding pension and insurance contributions.
* Monthly family, child and non‑resident allowances; annual salary review; pension scheme; death benefit; long‑term care; accident‑at‑work and unemployment insurances.
* Private medical insurance for employee and immediate family (includes prescriptions and dental & optical cover).
* Generous time off: 30 days of annual leave plus public holidays; additional family leave (child sick leave, parental leave, holiday clubs).
* Free shuttle bus to/from work, on‑site library and subsidised gym and cafeteria.
* Family benefits: on‑site nursery, 10 days child sick leave, generous parental leave.
* Visa exemption and educational grant for private schooling for non‑UK residents.
* Regular social club and sports activities on campus and remotely.
Legal & Equal Opportunity Statement
EMBL is a signatory of DORA and encourages applications from candidates of all genders, identities, nationalities and backgrounds. We offer visa exemptions to international applicants.
Closing Date
Applications will close at 23:59 CET on 28 June 2026.