Senior Platform Engineer
Build and operate the platforms that make AI and machine learning work at scale
We're looking for a Senior Platform Engineer to join our team and play a key role in designing and operating the platform that underpins AI and machine learning delivery.
This is a hands‑on senior platform role, focused on building robust, Kubernetes‑based platforms that enable MLOps engineers, ML engineers, and data scientists to deploy, run, and manage models safely and effectively in production.
While you'll need a strong understanding of how machine learning and LLM workloads are trained, packaged, deployed, and served, this is not a "deploy models all day" role. Instead, your impact will come from creating the infrastructure, tooling, workflows, and guardrails that allow others to do that work reliably and at scale.
What you'll be doing
You'll be responsible for building a production‑grade AI / ML platform, not just running clusters.
You will:
1. Design, build, and operate a Kubernetes‑based platform that supports multiple ML and engineering teams
2. Extend Kubernetes with MLOps‑specific capabilities, rather than treating it as a finished product
3. Provide platform‑level support for: model development and experimentation; model packaging, deployment, and promotion; scalable inference and LLM‑based workloads
4. Build shared platform services that enable consistent, repeatable model deployment, even where day‑to‑day deployment is owned by MLOps or ML engineers
5. Work closely with data scientists and MLOps engineers to ensure the platform is genuinely usable and fit for purpose
6. Own platform operability, reliability, security, and lifecycle management in production
7. Troubleshoot complex issues that cut across infrastructure, Kubernetes, and MLOps layers
8. Contribute to architectural decisions while remaining hands‑on with implementation
What we're looking for
This role is ideal for someone who sees themselves first and foremost as a platform engineer, with the depth to support AI and ML workloads properly.
Essential experience:
9. Strong background as a Senior Platform Engineer or Senior DevOps Engineer
10. Deep, hands‑on experience building and operating Kubernetes‑based platforms
11. Strong practical experience with Helm and Infrastructure as Code (e.g. Terraform)
12. Proven experience building internal platforms for other engineers, not just running workloads
13. Strong grasp of operational fundamentals: monitoring, logging, reliability, incidents, and maintainability
14. Comfortable collaborating closely with MLOps engineers and data scientists, even where responsibilities differ
ML platform & MLOps knowledge (important)
You don't need to be a full‑time MLOps engineer - but you do need practical understanding of how ML and AI workloads behave in production.
Experience of, or exposure to, areas such as:
15. MLOps platforms (e.g. Kubeflow or similar frameworks)
16. Model serving and inference platforms (e.g. KServe, vLLM, or equivalent)
17. Supporting LLM‑based workloads, including performance and scaling considerations
18. Notebook environments such as JupyterHub
19. Awareness of emerging tooling around Responsible / Trustworthy AI or comparable solutions
This ensures you're building a platform that actually works for AI use cases - not a generic compute layer.
Desirable experience
20. Working in organisations with a clear AI or data platform strategy
21. Supporting data scientists or ML engineers at scale
22. Experience in regulated, secure, or high‑assurance environments
23. Designing platforms that balance flexibility, governance, and control
Guidant, Carbon60, Lorien & SRG - The Impellam Group Portfolio are acting as an Employment Business in relation to this vacancy.