Role: Site Reliability Engineer
Location: Manchester (on-site / secure environments)
Clearance: SC required | DV preferred
Employment: Permanent or Contract
Salary/Rate: Competitive + project allowances (DOE)
Overview
We are seeking a Site Reliability Engineer (SRE) to support mission-critical platforms within a secure, high-assurance environment. This role focuses on reliability, scalability, automation, and operational resilience across complex infrastructure and cloud-enabled services.
You will work within a collaborative engineering team, ensuring that systems remain secure, performant, and highly available in support of critical national infrastructure and defence programmes.
Key Responsibilities
* Maintain and improve reliability, availability, and performance of critical services
* Implement monitoring, alerting, and observability solutions
* Automate infrastructure provisioning and operational workflows
* Support incident response, root cause analysis, and post-incident reviews
* Improve system resilience through fault tolerance and self-healing design
* Collaborate with DevOps, platform, and security teams to enhance service stability
* Maintain documentation, runbooks, and operational procedures
* Ensure systems meet security and compliance requirements
Technical Environment
Infrastructure & Cloud
* Linux systems administration
* AWS, Azure, or private cloud environments
* Virtualisation and container platforms
Automation & Infrastructure as Code
* Terraform, Ansible, Puppet, or similar
* CI/CD tooling (GitLab CI, Jenkins, Azure DevOps)
Containers & Orchestration
* Docker & Kubernetes
* Container security & runtime reliability
Observability & Monitoring
* Prometheus, Grafana, ELK stack, Splunk, or similar
* Logging, metrics, tracing & alerting strategies
Reliability & Performance
* High availability design & scaling strategies
* Load balancing & traffic management
* Performance tuning & capacity planning
Essential Requirements
* Active SC clearance as a minimum, or eligibility to obtain it
* DV clearance highly desirable
* Experience supporting production environments in secure or regulated sectors
* Strong troubleshooting and incident management skills
* Ability to work on-site within secure facilities
Desirable Experience
* Experience in defence, government, or critical infrastructure environments
* Knowledge of security hardening & compliance frameworks
* Scripting skills (Python, Bash, PowerShell)
* Understanding of Zero Trust and secure architecture principles
Why Join?
* Work on nationally significant, high-impact programmes
* Access to complex engineering challenges in secure environments
* Collaborative teams focused on engineering excellence
* Long-term programme stability and career development
Apply now to support secure, mission-critical systems that underpin national capability.