Site reliability engineer

London
Permanent
Albatross
Posted: 3 February
Offer description

Location: Remote, with the right to work and travel in Europe.

At Albatross, we’re building the second pillar of AI: a perception layer that understands how users actually experience content, in real time. Trained on live user interactions, Albatross learns and reasons on the fly. Our technology powers in-session discovery by adapting to evolving user interests in real time. We have raised significant funding and our platform already operates at scale, with billions of events processed and hundreds of millions of predictions served.


The Role

We’re looking for a Site Reliability Engineer to own the reliability and observability of our platform. This is a hands-on leadership role where you’ll design, build, and maintain our observability stack, lead incident response, oversee releases, and establish the processes and standards that allow the team to ship quickly and confidently. More specifically, you will:

* Observability & Monitoring: Own and evolve our observability stack (Prometheus, Grafana, Loki, Jaeger), including dashboards, alerts, and SLOs. Instrument services for meaningful metrics and tracing, reducing noise and improving signal.
* Reliability & Incident Response: Lead incident response and establish blameless postmortems, runbooks, and automated remediation. Define, track, and improve SLIs/SLOs to proactively reduce reliability risk.
* Release Management: Own the release process end-to-end, improving deployment speed, safety, and recovery. Implement progressive rollouts, feature flags, and rollback strategies.
* Platform & Tooling: Embed observability into the development lifecycle in close collaboration with engineering. Maintain and evolve our Kubernetes-based platform, adopting new tools when they add real value.


Requirements

* 5–7+ years in SRE, platform engineering, DevOps, or similar roles.
* Strong production experience with Kubernetes and modern observability stacks (Prometheus, Grafana, Loki, Jaeger/OpenTelemetry).
* Proven track record of leading incident response and building monitoring systems that teams actually use.
* Deep distributed systems knowledge and production debugging experience.
* Pragmatic approach to tooling and alerting that teams trust.
* Clear communicator across engineering, product, and leadership.
* STEM degree (Computer Science, Engineering, Mathematics, or similar).
* Plus: contributions to open-source observability projects and background in high-scale or high-availability environments.


Benefits

* Remote-first, async-friendly culture.
* Ownership and autonomy: you’ll shape how we do reliability.
* A team that cares about building things right.