Careers site reliability engineer team lead

Newtown (Powys)
Permanent
Cambridge University Press & Assessment
Site reliability engineer
Posted: 22 January
Offer description

Overview

The SRE Team Lead will lead a mature Site Reliability Engineering function within the Platform Operations Team, working closely with Platform Support and Engineering teams. This role demands strong thought leadership, technical depth, and strategic direction for the discipline, with a particular emphasis on leveraging AI-driven operations (AIOps) and FinOps practices to optimise reliability, performance, and cloud spend. Although this is a hands-on technical role, the SRE Team Lead will also manage a small team of SREs, providing clear direction and ensuring consistent, data-driven, AI-enhanced service delivery across the platforms while working collaboratively with existing support and engineering groups.


Responsibilities

* Apply core SRE and DevOps principles - culture, automation, testing, measurement, and continuous improvement - to build and optimise pipelines focused on rapid, reliable software delivery. Integrate AIOps capabilities, such as automated anomaly detection and intelligent alerting, to further enhance operational excellence.
* Work with Solutions Architecture, Development, and QA teams to automate processes wherever possible, creating and improving stable CI/CD pipelines for both software and infrastructure. Develop tools that enable rapid provisioning of environments and resources across all teams, incorporating AI-assisted automation where beneficial.
* Use automation, observability, and monitoring tools to improve site reliability and proactively identify issues. Support development teams with troubleshooting, particularly in infrastructure, networking, and multi-tier application design. Serve as a subject matter expert for cloud services, especially AWS PaaS, while applying FinOps practices to ensure cloud cost transparency, optimisation, and efficient resource usage.
* Create and maintain robust technical documentation for the infrastructure of the English platforms, including operational runbooks enhanced with predictive and AI-supported insights.
* Stay engaged with developments in the SRE, DevOps, AIOps, and FinOps communities, continually introducing new practices and technologies to improve reliability, performance, automation, and cloud cost efficiency.
* This position has been classified as a hybrid role, requiring the selected candidate to typically spend 40-60% of their time collaborating and connecting face-to-face at their dedicated location. Aside from our hybrid principles, other flexible working requests will be considered from the first day of employment, including other work arrangements should you require adjustments due to a disability or long-term health condition.


Qualifications

* Demonstrable passion for Site Reliability Engineering and drive to understand, anticipate, and counter platform-related issues before they become problems, with a commitment to staying up to date with the latest technological trends and developments.
* Strong communication skills, with the ability to collaborate across technical leadership and various business stakeholders, presenting ideas and strategies clearly and persuasively.
* Soft skills in motivating, inspiring, and leading a team (direct line management is not part of the role’s remit).
* Educated to degree level or equivalent with a minimum of 5 years' proven experience in a systems administration or DevOps blended role.
* Experience implementing technologies such as Terraform, GitHub Actions, and containerization/orchestration (e.g., Kubernetes & Docker).
* Expertise in monitoring tools like New Relic, Grafana, Alertmanager, and Site24x7.
* Extensive knowledge of cloud computing infrastructure, especially using Amazon Web Services (EKS, ECS, RDS, Route 53, etc.).
* Excellent troubleshooting, debugging, communication, and documentation skills.
* Experience of working within an Agile product development environment.


About Cambridge University Press & Assessment

We are Cambridge University Press & Assessment, a world-leading academic publisher and assessment organisation and a proud part of the University of Cambridge. Joining us is your opportunity to pursue potential. You will belong to a collaborative team that is exploring new and better ways to serve students, teachers and researchers across the globe - for the benefit of individuals, society and the world. Sharing our mission will inspire your own growth, development and progress, in an environment which embraces difference, change and aspiration.

Cambridge University Press & Assessment is committed to being a place where anyone can enjoy a successful career, where it is safe to speak up, and where we learn continuously to improve together. We welcome applications from all candidates, regardless of demographic characteristics (age, disability, educational attainment, ethnicity, gender, marital status, neurodiversity, religion, sex, gender identity and sexual identity), culture, or social class/background. We believe better outcomes come through diversity of thought, background and approach. We welcome applications from people from all backgrounds and communities, actively seeking to employ people from a wide range of different communities. If you are ready to take the next step in your Cambridge journey, we welcome your application. Together, we continue to shape a culture where everyone feels empowered to succeed and motivated to make a difference - for ourselves, for each other, and for learners worldwide.


Benefits

* 28 days annual leave plus bank holidays
* Private medical and Permanent Health Insurance
* Discretionary annual bonus
* Group personal pension scheme
* Life assurance up to 4 x annual salary
* Green travel schemes
© 2026 Jobijoba - All Rights Reserved
