Azure Site Reliability Engineer
£75,000 - £95,000 + benefits
London (Hybrid)
We’re looking for an Azure SRE to support and operate high-availability platforms processing tens of thousands of transactions per second.
You’ll work closely with engineering teams to maintain reliable, scalable infrastructure and improve platform performance using SRE practices.
Responsibilities
* Operate and support Azure-based infrastructure
* Manage infrastructure using Terraform
* Maintain and support Kubernetes clusters
* Support distributed platforms including Kafka (or similar messaging tools)
* Define and track SLIs, SLOs and error budgets
* Improve monitoring, alerting and incident response
* Reduce operational toil through automation
* Help improve MTTR and MTTD
Requirements
* Strong experience with Microsoft Azure
* Experience with Terraform / Infrastructure as Code
* Hands-on experience with Kubernetes
* Experience with Kafka or other Messaging tools.
* Good understanding of SRE concepts including SLIs, SLOs, error budgets and toil reduction
* Experience working with high-throughput distributed systems
Azure Site Reliability Engineer
£75,000 - £95,000 + benefits
London (Hybrid)