Description and Requirements
This role is open for the Edinburgh, Scotland location only. Candidates must be based there, as the position requires working from the office at least three days per week (3:2 hybrid policy).
The Lenovo AI Technology Center (LATC)—Lenovo’s global AI Center of Excellence—is driving our transformation into an AI-first organization. We are assembling a world-class team of researchers, engineers, and innovators to position Lenovo and its customers at the forefront of the generational shift toward AI. Lenovo is one of the world’s leading computing companies, delivering products across the entire technology spectrum, spanning wearables, smartphones (Motorola), laptops (ThinkPad, Yoga), PCs, workstations, servers, and services/solutions. This unmatched breadth gives us a unique canvas for AI innovation, including the ability to rapidly deploy cutting-edge foundation models and to enable flexible, hybrid-cloud, and agentic computing across our full product portfolio. To this end, we are building the next wave of AI core technologies and platforms that leverage and evolve with the fast-moving AI ecosystem, including novel model and agentic orchestration & collaboration across mobile, edge, and cloud resources. This space is evolving fast and so are we. If you’re ready to shape AI at a truly global scale, with products that touch every corner of life and work, there’s no better time to join us.
Summary
Lenovo is seeking a highly skilled AI Infrastructure Engineer / AI Operations Engineer to join our growing team. This critical role will focus on designing, building, and maintaining the infrastructure and tools necessary for efficient AI model development, deployment, and operation. Your expertise will enable our data scientists and engineers to focus on high-priority tasks while ensuring seamless operation of AI models in production. If you are passionate about making Smarter Technology For All, come help us realize our Hybrid AI vision!
Responsibilities:
AI Platform Engineering & Operations
1. Design, deploy, and maintain scalable Kubernetes/OpenShift-based AI and ML platforms, supporting diverse AI/ML and cloud-native workloads.
2. Implement and manage GitOps-driven platform configuration using ArgoCD and Helm.
3. Administer Linux systems, including package management, user/group management, file system navigation, shell scripting (Bash), and system configuration (systemd, networking).
MLOps & Model Lifecycle Management
4. Build and automate ML pipelines using Kubeflow Pipelines, Tekton, and Python SDKs.
5. Support deployment and serving of AI/ML models using KServe, Knative, and NVIDIA Triton (where applicable).
6. Integrate model registry, workflow automation, and end-to-end ML lifecycle tooling.
Automation, Observability & Reliability
7. Develop automation using Python, Ansible, Terraform, and CI/CD pipelines.
8. Implement monitoring and alerting with Prometheus, Grafana, and AlertManager for AI workloads and platform health.
9. Optimise the AI platform for performance, reliability, and scalability.
Cloud & Infrastructure Integration
10. Deploy and operate hybrid/multi-cloud Kubernetes environments across AWS, GCP, and on-prem infrastructure.
11. Implement identity, RBAC, and enterprise security integrations (Azure AD, LDAP).
Collaboration & Customer Success
12. Work across AI engineering, DevOps, data science, and platform teams to ensure smooth operation and feature delivery.
13. Provide technical guidance to stakeholders and support customer deployments in production environments.
Required Qualifications:
14. Bachelor’s degree in Computer Science, Engineering, or related field.
15. 8+ years of DevOps / Cloud Native engineering experience, with major focus on Kubernetes and containerised workloads.
16. Deep expertise with Kubernetes / OpenShift administration, including cluster configuration, operators, networking, and security.
17. Strong experience with GitOps, including ArgoCD and Helm.
18. Hands-on experience with MLOps tooling, for example KServe, Kubeflow, Tekton, Knative, and ML pipeline automation.
19. Proficiency in Python, Bash scripting, and automation frameworks (Ansible, Terraform).
20. Experience with cloud platforms including AWS/GCP/Azure.
21. Strong observability experience with Prometheus & Grafana or similar.
22. Excellent problem-solving, communication, and stakeholder engagement skills.
Bonus Points
23. Experience with Red Hat OpenShift AI ecosystem.
24. Knowledge of model serving patterns (Triton ensemble models, OCI artifact-based LLM serving).
25. Certifications such as CKA, CKS, GCP ACE, AWS SAA, or Red Hat OpenShift specialisations.
26. Experience with data engineering or AI/ML workflow orchestration.
27. Deployment and management of scaled CI/CD monorepo patterns.
28. Deployment and management of click-to-deploy Internal Developer Portals such as Backstage.
What we offer:
29. Opportunities for career advancement and personal development
30. Access to a diverse range of training programs
31. Performance-based rewards that celebrate your achievements
32. Flexibility with a hybrid work model (3:2) that blends home and office life
33. Electric car salary sacrifice scheme
34. Life insurance
#LATC