Site Reliability Engineer – FinTech / Global Payments – London HQ / Remote First
Location: The team is UK-based and offers a fully remote working option, with a headquarters in Central London.
In this role, you will join a leading SaaS FinTech scale-up that is setting a new industry standard in its market. The business aims to scale its platform significantly over the next few years to support a growing international client base.
Responsibilities
1. Champion core SRE practices: define SLIs/SLOs/SLAs, reduce toil through automation, and plan for Disaster Recovery.
2. Refine KPIs to support data-driven decisions around reliability and availability.
3. Monitor systems to ensure optimal performance and cost-efficiency, and to inform capacity planning.
4. Collaborate with dev teams to build resilient, observable, and maintainable features.
5. Promote DevOps culture by leading knowledge-sharing sessions and supporting issue resolution.
Skills
* Strong grounding in SRE principles and operational best practices.
* Proficient with observability tools (Prometheus, Grafana, OpenTelemetry, CloudWatch) and telemetry pipelines.
* Solid programming skills in Python and/or Go; Java experience a plus.
* Strong Linux and networking fundamentals (TCP, DNS, TLS, HTTP).
* Familiarity with IaC (Terraform), CI/CD (GitHub Actions, Jenkins), and Agile workflows.
* Experience with containerisation (Docker, Kubernetes) and stream processing (Kafka a plus).
Benefits
* Annual Bonus (10%)
* 30 days holiday + bank holidays
* Life Assurance
If this is of interest, please apply for a confidential discussion.