Site reliability engineer

Glasgow (Glasgow City)

FBI & TMT

Posted: 2 February

Offer description

Role Summary

As a Site Reliability Engineer (SRE) for our Data Platform, you will be the guardian of our mission-critical data infrastructure. You will bridge the gap between software engineering and systems operations to ensure our cloud-native environment, built on AWS, Snowflake, and Databricks, is scalable, resilient, and highly available. Your mission is to treat operations as an engineering problem, using automation to eliminate toil and driving a 'reliability-first' culture across our data ecosystem.

Key Responsibilities

1. Infrastructure as Code (IaC): Design and maintain automated provisioning and configuration management for AWS and data platform components using Terraform or CDK.
2. Resiliency & Disaster Recovery: Lead the strategy for high availability. You will design and execute DR drills, failure-mode testing, and recovery validation to ensure data integrity during outages.
3. Reliability Engineering: Define and monitor SLIs, SLOs, and SLAs. You will manage error budgets to balance the velocity of data engineering with the stability of the platform.
4. Observability: Implement comprehensive monitoring, logging, and tracing (using tools like CloudWatch, Datadog, or Grafana) to provide deep visibility...
