We’re building a centralised SRE team to champion reliability engineering across global technology infrastructure. As a Senior Site Reliability Engineer, you’ll be at the forefront of this transformation: engineering scalable systems, automating operations, and embedding resilience into every layer of the tech stack.
This isn’t just about keeping the lights on. It’s about rethinking how systems behave under pressure, how teams respond to incidents, and how automation can unlock new levels of performance and efficiency.
What You’ll Do:
* Engineer for Resilience: Design and implement systems that are fault-tolerant, self-healing, and built for scale.
* Automate Everything: Build tools and scripts to eliminate manual toil, streamline operations, and accelerate recovery.
* Lead Incident Response: Own the full incident lifecycle, from detection through resolution to prevention, backed by deep root-cause analysis and strategic fixes.
* Drive Performance: Monitor, tune, and optimise systems to ensure peak performance across platforms.
* Partner Across Teams: Collaborate with product, infrastructure, and development teams to embed SRE principles into the software lifecycle.
* Influence Culture: Advocate for reliability-first thinking, mentor engineers, and help shape a culture of technical excellence.
What You Bring:
* Strong coding skills in Python, PowerShell, or Go
* Deep understanding of systems engineering and cloud infrastructure
* Experience with observability, CI/CD, and automation frameworks
* Ability to assess risk, plan resources, and influence long-term technical direction
* Confidence advising senior stakeholders and shaping cross-functional initiatives
Leadership & Communication:
* Skilled at guiding teams through change
* Able to inspire adoption of new practices and mindsets