Want to help make the world a better & safer place?
This opportunity is with a well-funded research organisation in Oxford, tackling genuinely meaningful, real-world humanitarian challenges.
They’re looking for someone to take ownership of the GPU compute platform that underpins their scientists’ work every day.
In this role, you would:
✅ Build and operate GPU clusters for large-scale training and inference.
✅ Optimise data pipelines, storage systems, and networking for distributed workloads.
✅ Analyse performance and eliminate bottlenecks across the stack.
✅ Implement observability and security within a sensitive research environment.
✅ Collaborate closely with researchers on capacity planning and experimentation tooling.
What they’re looking for:
✅ Proven experience building and running ML compute infrastructure at scale.
✅ Deep understanding of GPU architecture and high-performance networking.
✅ Hands-on experience with Infrastructure as Code and CI/CD pipelines.
✅ Ability to work independently and shape solutions collaboratively.
✅ Experience migrating from traditional schedulers to containerised systems (nice to have).
Why this role?
Because the infrastructure you build will directly support work that actually makes a difference, not just optimising engagement metrics, but contributing to real humanitarian outcomes.
Compensation is also highly competitive. The budget is flexible, and they’re open to structuring a package that works for the right person, with salaries up to £300k (and potentially higher for exceptional candidates).
Interested?
Click “Easy Apply”, and I’ll be in touch with more details. No need for a polished CV at this stage - we can sort that out later.