Join Barclays as a Senior Site Reliability Engineer and become part of our newly formed Core SRE Team. This is a hands-on engineering role where you will design, build, and optimise automation frameworks, observability tools, and incident response mechanisms. This role also involves collaborating across GTIS and CTO, engaging with storage, data, and other product teams. You will act as a trusted advisor, providing strategic guidance and consultative support to help teams improve reliability, scalability, and efficiency.
Proficiency in Programming and Scripting - This includes expertise in languages such as Python, PowerShell, or Go, which are essential for automating routine tasks and system deployments.
Incident Management and Troubleshooting - The ability to manage incidents effectively, troubleshoot issues swiftly, and perform root cause analysis to prevent future incidents.
Systems Engineering and Automation - A deep understanding of systems engineering, including operating systems, networking, and cloud infrastructure. Proficiency in automation tools is crucial for maintaining system reliability at scale.
You may be assessed on the key critical skills relevant for success in the role, such as risk and controls, change and transformation, business acumen, strategic thinking, and digital and technology, as well as job-specific technical skills.
To apply software engineering techniques, automation, and incident response best practices to ensure the reliability, availability, and scalability of our systems, platforms, and technology.
Availability, performance, and scalability of systems and services through proactive monitoring, maintenance, and capacity planning.
Resolution, analysis and response to system outages and disruptions, and implementation of measures to prevent similar incidents from recurring.
Development of tools and scripts to automate operational processes, reducing manual workload, increasing efficiency, and improving system resilience.
Monitoring and optimisation of system performance and resource usage, identification and resolution of bottlenecks, and implementation of best practices for performance tuning.
Collaboration with development teams to integrate best practices for reliability, scalability, and performance into the software development lifecycle, and close cooperation with other teams to ensure smooth and efficient operations.
Stay informed of industry technology trends and innovations, and actively contribute to the organisation's technology communities to foster a culture of technical excellence and growth.
Plan resources, budgets, and policies; manage and maintain policies/processes; deliver continuous improvements and escalate breaches of policies/procedures.
Lead collaborative, multi-year assignments and guide team members through structured assignments, identifying where other areas of specialisation are needed to complete them. Train, guide and coach less experienced specialists, and provide information affecting long-term profits, organisational risks and strategic decisions.
Manage and mitigate risks through assessment, in support of the control and governance agenda.
Demonstrate leadership and accountability for managing risk and strengthening controls in relation to the work your team does.
In-depth analysis and interpretative thinking will be required to define problems and develop innovative solutions.
All colleagues will be expected to demonstrate the Barclays Values of Respect, Integrity, Service, Excellence and Stewardship – our moral compass, helping us do what we believe is right. They will also be expected to demonstrate the Barclays Mindset – to Empower, Challenge and Drive – the operating manual for how we behave.