Description
This role is crucial in maturing our SRE capability and contributing to the resiliency, availability and security of our infrastructure and applications.
You will lead training sessions, nurture a service-oriented culture and champion Site Reliability Engineering. You will play a pivotal role in one of the UK’s largest financial services transformations, working with shared tools and technology to empower colleagues and drive innovation.
What you will be doing
· Enhance the resiliency and reliability of platform services and product features.
· Maintain a calm, focused environment during incidents to enable effective resolution and protect engineering teams
· Monitor system health, create/manage error budgets, and drive improvements
· Build tools that support service reliability and code quality.
· Ensure incident reports are clear, actionable, and support problem management
· Drive service improvement initiatives including automation, monitoring, and alerting
· Lead innovative initiatives to drive next-generation technology and foster a high-performing, supportive environment.
· Mentor and develop team members, fostering a collaborative and customer-focused culture
· Explore patterns in incident data to identify risks and mitigation strategies
· Contribute to strategic planning and technical decision-making across the ECP Platform
· Work with stakeholders to set and monitor SLOs and SLIs, and embed them across the wider platform.
Essential skills and experience:
· Good understanding of SRE, with experience in Infrastructure as Code and CI/CD pipelines (Harness, GitHub Actions, Terraform, Ansible).
· Strong leadership and stakeholder management skills
· Knowledge of OCP, GCP and Azure cloud platforms.
· Reliability & Performance Management: Design, implement, and take ownership of SLOs / SLIs for critical platform services.
· Experience with Error Budgets.
· Ability to work under pressure and communicate technical issues clearly to diverse audiences
· Self-motivated with a continuous improvement mindset
· Experience with Dynatrace, Splunk, and modern observability tools
Any experience of the following would also be useful:
· Knowledge of the payments industry, including schemes, SLAs, and regulatory context
· Proficiency in Cloud Security and Networking
At Lloyds Banking Group, we're driven by a clear purpose: to help Britain prosper. Across the Group, our colleagues are focused on making a difference to customers, businesses and communities. With us you'll have a key role to play in shaping the financial services of the future, whilst the scale and reach of our Group means you'll have many opportunities to learn, grow and develop.
We keep your data safe. We'll only ever ask you to provide confidential or sensitive information once you have formally been invited to an interview or have accepted a verbal offer to join us, which is when we run our background checks. We'll always explain what we need and why, with any request coming from a trusted Lloyds Banking Group person.
We're focused on creating a values-led culture and are committed to building a workforce which reflects the diversity of the customers and communities we serve. Together we’re building a truly inclusive workplace where all of our colleagues have the opportunity to make a real difference.