
HPC Engineer - Generative Biology Institute

Oxford
Ellison Institute of Technology Oxford
Engineer
€52,500 a year
Posted: 13h ago
Offer description

Your Role

Working as part of a new Scientific Computing team within GBI, the HPC Engineer will help operate, improve, and scale the data and computing platform that will enable cutting‑edge research in engineering biology. This is a broad, hands‑on role at the interface of Linux systems, high‑performance computing, cloud infrastructure, Kubernetes, Slurm, storage, monitoring, and researcher support. They will help turn emerging researcher needs and operational lessons into robust platform improvements, reusable tooling, and clear runbooks.

This role is particularly suited to someone who enjoys practical systems work, learning new technologies, and collaborating closely with scientists and engineers. We do not expect candidates to have deep experience in every technology listed in this description. Instead, we are looking for a strong, scientifically minded systems engineer: someone who can troubleshoot complex environments, communicate clearly with multidisciplinary teams, learn unfamiliar tools quickly, and help build reliable, scalable services that advance GBI’s scientific mission.


Key Responsibilities

* Operate, maintain, and improve GBI’s hybrid HPC platform, including Linux‑based compute environments, Slurm/Slinky workloads, Kubernetes/OKE services, Open OnDemand, GPU and CPU partitions, and shared storage
* Help provision, configure, scale, and validate compute, storage, networking, and platform services using infrastructure as code, configuration management, and automation tools such as Terraform, Helm and Ansible
* Monitor platform health, capacity, job scheduling, GPU utilisation, storage behaviour, and network performance; investigate issues using tools such as Prometheus and Grafana
* Support researchers in using our Scientific Computing Platform, including triaging user issues and translating common pain points into platform improvements
* Build and maintain reproducible runtime environments, container images, and workflow‑supporting services for scientific computing workloads, including bioinformatics, AI/ML, data processing, and simulation workflows
* Contribute to safe rollout and maintenance processes for Slurm images, worker node pools, scheduler configuration, container runtime changes, security updates, and monitoring improvements
* Create and maintain clear technical documentation, runbooks, validation checks, and issue/PR notes so the platform can be operated consistently and improved safely by the wider team


Requirements


Essential Knowledge, Skills and Experience

* Bachelor’s or Master’s degree in Computer Science, Computational Biology, Engineering, Physics, Mathematics, or a related discipline, or equivalent practical experience
* Hands‑on experience supporting or administering Linux‑based systems in an HPC, cloud, research, academic, or production environment
* Working knowledge of HPC or batch‑computing concepts, including schedulers, resource requests, queues/partitions, shared filesystems, and multi‑user compute environments; Slurm experience is preferred
* Ability to troubleshoot issues across systems, networking, storage, identity, containers, schedulers, and user workloads, and to follow problems through to a reliable operational fix
* Experience with scripting, automation, and version‑controlled operational changes using tools such as Git, CI/CD, Terraform, Ansible, Helm, or similar
* Ability to work closely with multidisciplinary research teams, understand scientific computing needs, and deliver practical services that advance scientific goals
* Strong communication and documentation skills, with the ability to explain technical concepts clearly to scientists, engineers, and non‑specialist audiences
* A proactive, learning‑oriented approach suited to a new team building and improving a platform while also operating it day to day


Desirable Knowledge, Skills and Experience

* Experience operating Slurm clusters, Slinky/slurm‑operator, Open OnDemand, JupyterLab services, or other researcher‑facing HPC portals and access patterns
* Experience with Kubernetes or managed Kubernetes platforms such as OCI OKE, EKS, GKE, or AKS, including Helm, Argo CD, operators, services, storage classes, and workload troubleshooting
* Experience with cloud infrastructure, particularly OCI, and with infrastructure as code and remote execution models such as Terraform Cloud
* Experience with shared and high‑performance storage such as Lustre, BeeGFS, GPFS, NFS, OCI File Storage, object storage, or data movement workflows for large scientific datasets
* Experience supporting GPU‑accelerated workloads, NVIDIA tooling, CUDA‑aware environments, DCGM metrics, GPU health monitoring, and/or AI/ML and bioinformatics workloads on shared compute platforms
* Experience with containerised HPC and scientific workflow tooling, such as Apptainer/Singularity, Docker/Podman, Pyxis/Enroot, Nextflow, Snakemake, CWL, or WDL
* Experience building monitoring and operational dashboards using Prometheus, Grafana, exporter metrics, alerting rules, or capacity and reliability reporting
* Familiarity with identity, access, and security controls in Linux or research environments, such as OIDC, Okta ASA/PAM, least‑privilege access, and security patching
* Experience working in a scientific, academic, life‑science, or research computing environment where requirements evolve through close collaboration with researchers


Benefits

* Salary: Competitive + travel allowance + bonus
* Enhanced holiday pay
* Pension
* Life Assurance
* Income Protection
* Private Medical Insurance
* Hospital Cash Plan
* Therapy Services
* Perk Box
* Electric Car Scheme


Working Together – What It Involves

* You must have the right to work permanently in the UK and be willing to travel as necessary. In certain cases we can consider sponsorship, which will be assessed on a case‑by‑case basis
* You will live in, or within easy commuting distance of, Oxford (or be willing to relocate)
* Hybrid working