Responsibilities
* Architect and implement scalable infrastructure for AI and ML workloads (training, evaluation, inference).
* Design and operate Kubernetes-based platforms for multi-tenant, production AI systems.
* Build and refine MLOps pipelines covering model versioning, experiment tracking, CI/CD, deployment, monitoring, and rollback.
* Establish DevOps best practices across infrastructure, application, and ML layers.
* Deploy and operate enterprise-grade production systems with strong uptime and reliability standards.
* Leverage modern AI coding agents and developer copilots to accelerate engineering workflows.
* Partner with ML engineers and application teams to translate research and product requirements into scalable infrastructure capabilities.
Qualifications
* 8-12+ years of experience in infrastructure, platform engineering, or distributed systems.
* Proven experience building and operating enterprise-grade production systems.
* Deep hands‑on expertise with Kubernetes in production (autoscaling, networking, upgrades, reliability patterns).
* Strong background in MLOps and ML platform lifecycle management.
* Experience with cloud platforms (AWS, GCP, or Azure) and Infrastructure-as-Code (Terraform, Pulumi, etc.).
* Practical, hands‑on use of AI coding agents / AI‑assisted development tools.
* Strong programming ability in Go, Python, or a similar infrastructure‑oriented language.
Nice to Have
* Experience supporting GPU workloads and large-scale training/inference.
* Familiarity with enterprise security standards (SOC 2, ISO, zero‑trust architectures).
* Experience building internal developer platforms serving multiple teams.
* Background supporting AI systems in regulated or high‑reliability environments.
Additional Information
At Version 1, we believe in providing our employees with a comprehensive benefits package that prioritises their wellbeing, professional growth, and financial stability.
* Share in our success with our Quarterly Performance-Related Profit Share Scheme, where employees collectively benefit from a share of our company's profits.
* Strong career progression and mentorship coaching through our Strength in Balance & Leadership schemes, with a dedicated quarterly Pathways Career Development programme.
* Flexible/remote working: Version 1 is tremendously understanding of life events and people’s individual circumstances, and offers flexibility to help you achieve a healthy work-life balance.
* Financial Wellbeing initiatives including Pension, Private Healthcare Cover, Life Assurance, Financial Advice and an Employee Discount scheme.
* Employee Wellbeing schemes including Gym Discounts, Bike to Work, Fitness classes, Mindfulness Workshops, Employee Assistance Programme and much more.
* Generous holiday allowance, enhanced maternity/paternity leave, marriage/civil partnership leave and special leave policies.
* Educational assistance, incentivised certifications, and accreditations, including AWS, Microsoft, Oracle, and Red Hat.
* Reward schemes including Version 1’s Annual Excellence Awards & ‘Call-Out’ platform.
* Environment, Social and Community First initiatives allow you to get involved in local fundraising and development opportunities as part of fostering our diversity, inclusion and belonging schemes.