Senior Site Reliability Engineer (SRE), UK Remote
Permanent | Up to £120,000 | Fully Remote (UK Only)

This Is NOT a DevOps Role: Real SRE Work Only

We're looking for a true Senior Site Reliability Engineer with deep incident management experience, strong operational ownership, and expert Linux/AWS troubleshooting skills. This role is focused entirely on reliability, availability, incident response, and systems engineering, not on building CI/CD pipelines or acting as DevOps by another name.

Leadership Requirement: Small Team Technical Lead

You must have experience leading a small engineering team (2-5 people), defining technical direction, improving on-call processes, and owning reliability strategy. This is a hands-on role with real SRE leadership, not people management.

About the Role

As a Senior SRE, you will own the reliability, resilience, and operational health of large-scale AWS/Linux systems. You'll join an engineering organisation where SRE principles are fully embedded, respected, and treated as a distinct discipline.

Key Responsibilities

- Lead major incidents, mitigation, root cause analysis (RCA), and preventative improvements
- Own and refine SLIs, SLOs, and error budgets
- Reduce operational toil through automation
- Carry out deep-dive Linux debugging, performance tuning, and systems analysis
- Strengthen observability, monitoring, and alerting
- Provide technical leadership to a small SRE/engineering group
- Improve and manage on-call processes (PagerDuty, OpsGenie, etc.)
- Collaborate with development teams to build reliability into system design

What You'll Bring

- Strong AWS experience (EC2, networking, autoscaling, IAM, load balancing)
- Deep Linux troubleshooting skills (performance, networking, debugging)
- Real 24/7 production on-call experience
- Hands-on incident management and postmortems
- Experience mentoring or leading a small technical team
- Scripting/automation with Python, Go, or Bash
- Strong observability skills (Datadog, Prometheus, Grafana, CloudWatch)

Why This Role Appeals to Real SREs

- You'll be solving actual SRE problems: reliability, incidents, resilience, uptime
- You'll guide a small team through complex engineering challenges