Site reliability engineering specialist

Birmingham (West Midlands)

BT

Engineering

Posted: 17 April

Offer description

What you’ll be doing

1. Executes the implementation of new software development life cycle automation tools, frameworks, and code pipelines (continuous integration/continuous delivery pipelines), applying best practices with a focus on the re-use of application code; demonstrates consistent software delivery practices and produces continuous integration/continuous delivery platform solutions
2. Executes the implementation of automation technologies to ensure repeatability, eliminating toil, reducing mean time to detection and resolution, and repairing services
3. Proactively identifies and manages risk through regular assessment and diligent execution of controls and mitigations, proactively raising any concerns
4. Leads scale testing to measure, tune and optimise system performance
5. Executes metric/monitoring analysis that creates stability, security, and performance improvements
6. Designs, analyses, develops and troubleshoots highly-distributed large-scale production systems spanning on-prem and cloud-based hosting
7. Executes approaches that scale systems sustainably through mechanisms like automation and evolves systems by pushing for changes that improve reliability and velocity
8. Writes and delivers infrastructure as code software to improve the availability, scalability, latency, and efficiency of services
9. Implements robust monitoring and alerting systems and performs root cause analysis and post-mortems with an eye towards future prevention
10. Inspects queue and support processing to ensure early warning of support issues
11. Executes retrospective and preventive actions after each high severity production incident
12. Analyses complex systems from a reliability and resilience perspective and identifies sources of instability in distributed systems
13. Champions, continuously develops and shares with the team knowledge of emerging trends and changes in site reliability engineering best practices and industry standards
14. Mentors other site reliability engineers, helping to improve the team’s abilities by acting as a technical resource

The skills you’ll need to succeed

15. Incident Management: Ensures that any incidents affecting the processes and performance of relevant technology services or systems are managed appropriately to mitigate risk and minimise disruption.
16. Infrastructure Configuration: Designs, deploys and maintains highly available and safe networks and applications.
17. Continuous Integration / Deployment: Covers the build, deploy and unit testing stages of the software release process into Production.
18. Service Assurance: Service-level management involving the monitoring and management of the quality of the key performance indicators (KPIs) of a product or service to provide stable and performant applications to end users.
19. Troubleshooting: Applies problem solving methods to repair failed products or processes.
20. Programming / Scripting: Provides automation to ensure repeatability, eliminating toil.
21. System Administration: Knowledge of Windows and Linux system administration.
22. Project Management: Ability to plan projects, assess risks and opportunities, communicate with stakeholders, troubleshoot problems, and more.
23. Application Performance Monitoring & Alerting: Ensures suitable, modern and proactive monitoring and alerting is in place to raise and mitigate concerns in system performance before users are aware of them.

Experience you’d be expected to have

Mandatory:

24. Broad technical experience of programming and scripting, e.g. Bash, Python and PowerShell
25. Broad technical experience across a range of IT infrastructure disciplines (e.g. networks, datacentre infrastructure, operating systems, etc.)
26. Experience of Continuous Improvement
27. Strong experience of communicating complex detail to technical and non-technical audiences
28. Working with wider programme/LoB delivery organisations

Preferred:

29. Team leading experience or an interest in leading a team; experience within change transformation is a further advantage
30. Experience in one or more of the following:
31. Application implementation / solution design
32. IT security and compliance
33. Physical Security
34. Datacentre infrastructure
35. Software development
36. Vendor management

Benefits

At BT, we entertain, educate, and empower millions of people every single day. We’re a brand built on connecting people – whether that’s friends, family, businesses, or communities. Working here, you’ll receive an attractive salary and a range of competitive benefits, but – more than that – you’ll be joining an ambitious organisation with a culture of togetherness, collaboration, and inclusivity, that takes a genuine and proactive interest in your progress and development.

37. Competitive salary
38. 10% on-target bonus
39. BT Pension scheme, minimum 5% Employee contribution, BT contribution 10%
40. 25 days annual leave (not including bank holidays), increasing with service
41. Huge range of flexible benefits including cycle to work, healthcare, season ticket loan
42. World-class training and development opportunities
43. Option to join BT Shares Saving schemes
44. Discounted broadband, mobile and TV packages
45. Access to hundreds of retail discounts, including the BT shop
