Lead Site Reliability Engineer (SRE)
Permanent
South West
Hybrid (2 Days On-Site)
We're working with a reputable financial services organisation looking to hire a Lead SRE to drive resilience across their mature cloud estate. This is a hands-on role focused on disaster recovery, resilience testing, and onboarding systems into AWS Resilience Hub, with a strong emphasis on working closely with application and infrastructure teams.
Key Responsibilities:
* Lead the design and execution of disaster recovery strategy
* Support the rollout and adoption of AWS Resilience Hub across business-critical systems
* Deliver resilience automation initiatives and tooling
* Run or support chaos engineering exercises to validate fault tolerance
* Strengthen business continuity capabilities in a regulated environment
* Collaborate with application managers and stakeholders across engineering and infrastructure
Key Experience:
* Strong experience across AWS services (e.g. EC2, Lambda, S3, VPC, CloudFormation)
* Deep understanding of resilience in cloud-based infrastructure
* Hands-on involvement in disaster recovery planning and testing
* Exposure to AWS Resilience Hub
* Familiarity with chaos engineering practices
* Confidence working across technical and business teams
The Role Offers:
* Hybrid working – 2 days per week on-site in the South West (commutable from Bristol, Swindon, Gloucester)
* Private medical, enhanced pension, 28 days holiday, and more
If you're passionate about cloud resilience and want to take ownership of a critical function within a mature AWS estate, we'd love to hear from you.