Staff Data Engineer – Onyx Research Data Platform (GSK R&D)
Location: London, United Kingdom — Hybrid (office & remote)
Posting Date: Jan 6 2026 | Closing Date: Jan 24 2026
Position Summary
As a Staff Data Engineer, you will lead a scrum team of world‑class data engineers, building and guiding the development of automated, scalable, sustainable pipelines that support our scientists, engineers, and decision‑makers. You will shape the engineering direction of the Onyx platform and influence best‑practice adoption across data ingestion, streaming, transformation, knowledge graphs, metadata, vectorized pipelines, and AI/GenAI integration. You will partner with platform, bioinformatics, and ML teams to ensure data pipelines meet scientific and regulatory requirements.
Key Responsibilities
* Lead and mentor a team of data engineers in delivering data and knowledge products that advance GSK R&D.
* Architect end‑to‑end data pipeline patterns, abstractions, and reusable frameworks, improving reliability, observability, and developer experience.
* Define standards for data modelling, lineage, metadata, vectorization, and governance.
* Collaborate with other data engineering leads to design new data flows that maximise reuse and align with event‑driven microservice architectures.
* Work with ML and GenAI teams to optimise data flows for fine‑tuning, prompt engineering, and retrieval‑augmented generation, championing best practices for GenAI workloads.
* Drive engineering discipline, including the QMS framework and CI/CD best practices, and spearhead continuous improvement in these areas.
* Serve as a technical knowledge holder, continuously building expertise and sharing insights across the organisation.
Basic Qualifications
* Bachelor’s degree in Data Engineering, Computer Science, Software Engineering or related discipline.
* Strong data engineering experience in industry.
* Experience with agile software development (Jira, Confluence).
* Proven ability to overcome high‑volume, high‑compute challenges.
* Familiarity with orchestration tooling and cloud platforms (AWS, GCP, Azure, Kubernetes).
* Experience in automated testing and design.
* Experience with DevOps practices and CI/CD pipelines (Git/GitLab/Jenkins/etc.).
* Deep knowledge of at least one programming language (Python, Scala, Java).
* Hands‑on experience with big‑data tools (Spark, Kafka, etc.).
* Experience with IaC and automation tools (Terraform).
* Expertise in data modelling, database concepts and SQL.
Preferred Qualifications
* Master’s or PhD in Data Engineering, CS or related discipline.
* Familiarity with vector databases, embeddings, LLM data pipelines and RAG architecture.
Benefits & Working Pattern
Hybrid working model: balance in‑office collaboration with remote focus.
Application Process
Submit your CV and a short note explaining why this role matters to you and how you would contribute.
We welcome candidates from all backgrounds.
Legal Notice
GSK is an Equal Opportunity Employer. This means that all qualified applicants will receive equal consideration for employment without regard to race, color, religion, sex (including pregnancy, gender identity, and sexual orientation), parental status, national origin, age, disability, genetic information (including family medical history), military service or any basis prohibited under federal, state or local law.
If you require adjustments to the recruitment process, please contact UKRecruitment.Adjustments@gsk.com.