Site reliability engineer

York (North Yorkshire)

TECEZE

Posted: 23h ago

Offer description

Job Title: Site Reliability Engineer

Location: Hybrid Remote – London EC2M

Contract (12 months)

Outside IR35

About the Role:

We are partnering with one of the top companies in the mobile industry to hire a Site Reliability Engineer (SRE). In this role, you will collaborate with cross-functional teams to drive the design, development, and delivery of high-performing, scalable, and reliable infrastructure and services. You’ll be responsible for building robust systems, automating operations, and enhancing observability and deployment pipelines for modern cloud-native applications.

Key Responsibilities:

* System Reliability & Performance: Maintain and scale critical services and infrastructure. Identify performance bottlenecks and work closely with product engineers to optimise applications.
* Kubernetes Operations: Administer, scale, and troubleshoot clusters in GKE, EKS, or other Kubernetes environments.
* Infrastructure as Code (IaC): Design and maintain scalable infrastructure using Terraform and automate deployments across public, private, or hybrid clouds (mainly AWS).
* CI/CD Pipeline Enhancement: Build and improve robust CI/CD pipelines to support fast and safe deployment cycles.
* Observability & Monitoring: Implement code-based instrumentation and telemetry. Ensure systems are observable with tools for logging, metrics, and alerting.
* Automation & Scripting: Write tooling and automation scripts in Python, Go, or Rust to reduce toil and manual intervention.
* Storage & Networking: Manage and optimise storage services like Amazon S3 or Google Cloud Storage (GCS). Resolve complex networking issues in multi-cloud environments.

Essential Requirements:

* 5+ years of hands-on experience as a Site Reliability Engineer.
* Proven expertise in Kubernetes (GKE/EKS).
* Strong proficiency in Python, Go, or Rust.
* Solid experience with AWS and Infrastructure as Code using Terraform.
* Deep understanding of Linux internals, standard networking protocols, and distributed systems architecture.
* Hands-on experience with automation and performance optimisation.
* Strong knowledge of SRE principles and methodologies.
* Experience with observability tools and telemetry systems.
* Exposure to Google Cloud Platform (GCP).
* Familiarity with hybrid or multi-cloud architecture.
* Experience with service meshes or edge proxies (e.g., Envoy, Istio).
* Working knowledge of container security best practices.
