Site Reliability Engineer (SRE) at Barclays
Join us at Barclays as a Site Reliability Engineer (SRE). We’re looking for someone to help design, develop, and enhance software that powers critical business, platform, and technology capabilities for our customers and colleagues.
Key Responsibilities
* Resiliency and capacity management for Kubernetes/OpenShift clusters and platforms: SLOs, error budgets, autoscaling, quotas, node pools, capacity planning.
* AWS platforms (Lambda, EKS, EC2, VPC, IAM), including cost optimisation and resource management: budgets, rightsizing, scaling policies.
* Observability and incident response with automation: monitoring, alerting, tracing, on-call, post‑mortems; Python/Shell scripting for runbooks and auto‑remediation.
* Performance & load engineering and capacity modelling.
* Chaos/DR testing and reliability patterns: circuit breakers, bulkheads, retries/back‑off.
* FinOps tooling familiarity: cost explorer/curation, anomaly detection, utilisation dashboards.
Qualifications
* Experience designing and improving software using industry‑aligned programming languages, frameworks, and tools.
* Strong cross‑functional collaboration with product managers, designers, and engineers.
* Knowledge of secure coding practices, unit testing, and code quality.
* Understanding of industry technology trends and ability to contribute to technical communities.
* Experience with incident management, monitoring, alerting, and on‑call cadence.
Accountabilities and Leadership Expectations
* Development and delivery of high‑quality software solutions with scalable, maintainable code.
* Cross‑functional collaboration to define requirements, devise solutions, and align with business objectives.
* Participation in code reviews and promotion of a culture of code quality and knowledge sharing.
* Staying informed of industry trends and contributing to organisational technology communities.
* Adherence to secure coding practices to mitigate vulnerabilities and protect sensitive data.
* Implementation of effective unit testing practices to ensure code reliability.
* Ability to advise and influence decision‑making, contribute to policy development, and take responsibility for operational effectiveness.
* Leadership of a team to deliver complex tasks, set objectives, coach employees, and drive performance.
* For individual contributors, leadership of collaborative assignments, guidance of team members, and identification of cross‑functional expertise.
* Consultation on complex issues and provision of advice to people leaders.
* Risk mitigation, policy development, and strengthening of controls related to work performed.
* Engagement with other areas to align support with business strategy.
* Complex analysis of data from multiple sources to solve problems creatively and effectively.
* Clear communication of complex information to diverse audiences.
* Influencing stakeholders to achieve outcomes.
All colleagues are expected to demonstrate the Barclays values of Respect, Integrity, Service, Excellence and Stewardship and the Barclays mindset of Empower, Challenge and Drive.
This role is based in Knutsford.