EMBL

Biotechnology

Bioinformatics Data Engineer (RNA Resources)

£60–85k (AI est.) · Hinxton, Cambridgeshire, United Kingdom · Full time · Remote friendly
Market Sentiment
HIGH DEMAND

Neural analysis suggests this role is optimal for Mid+ candidates.

The Brief

“Bioinformatics Data Engineer (RNA Resources) at EMBL. Skills: Data Engineering, Bioinformatics, RNA biology, LLM pipelines. Run and maintain data pipelines.”

Industry & Context.

Biotechnology
Problems you'll solve

Identify areas for improvement; Identify areas for optimisation; Identify areas for scalability

What They're Looking For.

Must Have

Master’s level or equivalent qualification, Proficiency in Python, Experience with relational databases, Experience with SQL, Demonstrated track record of developing and maintaining production bioinformatics pipelines, Experience building applications with LLMs, Familiarity with Docker or other containerisation technologies, Comfortable using Git/GitHub, Comfortable using Unix, Comfortable using Bash, Experience of using AI assisted coding tools, Ability to apply best-practice software development methodologies

Nice to Have

Knowledge of RNA biology, Practical experience with Rfam, Practical experience with Infernal, Practical experience with R-scape, Experience with tools for secondary structure prediction, Familiarity with gene annotation, Familiarity with genome feature representation, Experience with high-performance computing environments, Experience in planning and executing data migration projects, Experience with AI workflow libraries, Experience with Kubernetes, Experience with cloud infrastructure platforms, Experience with the Rust programming language

What You'll Do.

Maintain data pipelines

Optimise data pipelines

Analyse existing data curation pipelines

Analyse data production pipelines

Identify areas for improvement

Identify areas for optimisation

Identify areas for scalability

Modernise Rfam curation pipelines

Containerise Rfam curation pipelines

Implement AI-assisted agentic curation

Develop LLM pipelines

Develop scalable workflows

Annotate ncRNA in genomes

Document data pipelines

Participate in data releases

Present at major conferences

Present at consortium meetings

Present at advisory board meetings

Gather feedback from community members

Keep up to date with RNA science developments

How You'll Work.

Team & Collaboration

Cross-functional teams; RNA bioinformatician; Full-stack software developers; Rfam biocurator

Communication Scope

Presentations; Knowledge sharing

Full Job Description

About the Team

Rfam and RNAcentral are key resources for RNA biology, serving tens of thousands of users every year and widely cited in the scientific literature. We are recruiting a Bioinformatics Data Engineer to develop and maintain both the [_Rfam_](https://rfam.org/) and [_RNAcentral_](https://rnacentral.org/) databases. They are currently funded by the [_BBSRC_](https://bbsrc.ukri.org/) and [_Wellcome_](http://wellcome.org). The RNA Resources team is part of the Sequence Families group led by [_Alex Bateman_](https://www.ebi.ac.uk/about/people/alex-bateman). You will be reporting to the Project Leader for RNA Resources, and working closely with an RNA bioinformatician, two full-stack software developers, and an Rfam biocurator.

Your role

As a Bioinformatics Data Engineer, you will run, maintain and optimise our data pipelines, ensuring efficient data processing, storage and retrieval for Rfam and RNAcentral. You will work closely with cross-functional teams to analyse requirements, propose new data pipeline architectures, and implement solutions to improve performance and scalability.

The tasks will include:

* Analysing existing data curation and data production pipelines and identifying areas for improvement, optimisation, and scalability.
* Modernising and containerising Rfam curation pipelines, and implementing human-in-the-loop, AI-assisted agentic curation.
* Developing and scaling LLM pipelines used in RNAcentral for literature summarisation and curation.
* Developing scalable workflows for ncRNA annotation in genomes.
* Documenting data pipelines, processes, and workflows for internal reference and knowledge sharing.
* Participating in RNAcentral and Rfam data releases.

You will also be responsible for outreach to the scientific community through presentations at major conferences such as the RNA Society Annual Meeting and ISMB. Additionally, you will present at the RNAcentral consortium meetings and Scientific Advisory Board meetings, gathering regular


How to Apply on Workday

  • Workday has a multi-step form — save your progress after every section.
  • "Apply With LinkedIn" can fail or lose data; manual entry is more reliable.
  • Watch for the "Submit for Review" final step — hitting "Save" alone does not submit.
  • Job requisition numbers are useful when following up with HR by email.
