Site reliability engineer

London

Stott & May

Posted: 11 June

Offer description

Site Reliability Engineer (DevOps)

*UK Enhanced DV clearance essential*

Start: ASAP

Duration: initial 12-month contract

Pay: inside IR35, negotiable

Location: full-time on site in central London (5 days in office)

Role Description:

In this role you’ll be at the forefront of delivering enhanced reliability, performance, and quality to a key national security customer. Joining a growing team, you’ll help create a culture of continuous improvement and play a pivotal role in revolutionising how systems are developed and supported. This role combines operational support with software engineering, allowing you to design tools and applications that monitor and improve system health. As part of a wider programme, you'll be integral to supporting the customer's critical mission.

Key Responsibilities:

* Support and maintain critical services, enhancing the availability, performance, and stability of core mission applications.
* Participate in the 24/7 on-call rota (one week in five; overtime rate TBC), supporting production systems outside business hours, with additional on-call allowances and overtime benefits.
* Automate manual operations work (e.g. incident tickets, on-call tasks) to improve efficiency.
* Collaborate with development teams, advising on best practices for system design and implementation.
* Design and deploy monitoring tools to provide intelligent insights into system health, customising tools where necessary.
* Understand the relationship between software and infrastructure, ensuring systems are scalable and resilient to failure.
* Participate in the wider DevOps/SRE community, sharing knowledge and best practices across the organisation.

Key Skills & Experience:

* Experience with, or enthusiasm for, software development in web technologies and object-oriented programming.
* Familiarity with database technologies such as Oracle SQL, MongoDB, or Postgres.
* Proficiency with Linux and Windows command lines (e.g. Bash, PowerShell).
* Experience with monitoring large systems using tools like Grafana, Prometheus, ELK, and Splunk.
* Knowledge of Agile methodologies and supporting tools such as the Atlassian suite.
* Strong troubleshooting skills across various levels of the application stack.
* Familiarity with ITIL processes.
* Experience with microservices architectures and container platforms like Docker, Kubernetes, and OpenShift.
* A passion for learning new technologies and solving complex problems.
* Awareness of emerging tech trends and tools in the SRE space.

Interested in this role? Please apply directly to this advert with an updated CV to be considered.
