Job Title: Senior Site Reliability Engineer (SRE)
Location: London, UK – Onsite (5 days/week)
Employment Type: Permanent
Salary: Up to £80,000 per annum (Gross)
About the Role:
We are seeking a highly skilled and motivated Senior Site Reliability Engineer (SRE) to join our London-based team. This role is ideal for someone passionate about service reliability, scalability, and performance. As an SRE, you will collaborate with development and operations teams to automate infrastructure, enhance observability, and reduce manual, repetitive work (toil) to improve overall system health.
Key Responsibilities:
* Design, build, and maintain scalable, resilient systems and services.
* Automate routine tasks and eliminate manual effort using scripting and infrastructure-as-code.
* Collaborate with development teams to ensure best practices for deployment, monitoring, and performance tuning.
* Drive incident management processes, root cause analysis, and continuous improvement of system reliability.
* Maintain and improve observability using monitoring and logging tools.
* Optimize cloud infrastructure usage and costs.
Primary Skills & Experience:
* Strong hands-on experience with cloud platforms, especially AWS (experience with GCP or Azure is a plus).
* Deep understanding of container and orchestration technologies such as Docker and Kubernetes.
* Proficiency in monitoring, logging, and observability tools, including Datadog, Splunk, Dynatrace, AppDynamics, Prometheus, Grafana, ELK Stack, and CloudWatch, plus Gremlin for chaos engineering and ThousandEyes for network monitoring.
* Experience with Terraform, Jenkins, GitLab CI, PostgreSQL, Redis, and Kong API Gateway.
* Solid understanding of networking, security best practices, and infrastructure automation.
* Exposure to AWS ECS, Atlas, and internal tooling integrations.
* Diagramming and documentation skills using Lucidchart and PlantUML.
Secondary Skills:
* Familiarity with ServiceNow (SNOW) and Jira for incident and task tracking.
* Competency in shell scripting, Linux system administration, Bitbucket, and Akamai.
* Experience working within DevOps pipelines and CI/CD frameworks.
Qualifications:
* Bachelor’s degree in Computer Science, Engineering, or a related technical field (or equivalent practical experience).
* 8+ years of relevant experience in SRE, DevOps, or Infrastructure Engineering roles.