Location: Bournemouth, UK
Onsite: 5 Days/Week
Employment type: Permanent
We are seeking a Site Reliability Engineer (SRE) to design, build, and maintain highly available, resilient, and scalable systems. You will collaborate closely with engineering, product, and operations teams to ensure our Java/Spring Boot applications run smoothly 24/7 in a cloud environment. Additionally, you will drive the adoption of analytics and data‑driven insights to optimize system performance and extract value from operational data.
Key Responsibilities
* Reliability & Scalability: Design, implement, and maintain systems that are robust, scalable, and highly available, supporting millions of daily transactions.
* Cloud Migration: Lead and support the migration of applications and infrastructure to public cloud platforms, following best practices in security, reliability, and cost management.
* Automation & Infrastructure as Code: Develop and maintain automation scripts and infrastructure as code using Kubernetes and Terraform.
* Monitoring & Incident Response: Build and enhance monitoring, alerting, and observability solutions. Respond to incidents, perform root cause analysis, and drive continuous improvement.
* Collaboration: Partner with software engineers, product managers, and business stakeholders to deliver solutions that meet business needs and operational requirements.
* Analytics & Data Insights: Leverage cloud‑based analytics tools to monitor system health, optimize performance, and extract actionable insights.
* Continuous Improvement: Identify and implement opportunities to improve the reliability, efficiency, and scalability of the platform.
Required Qualifications
* Proven experience as a Site Reliability Engineer or DevOps Engineer, or in a similar role, supporting large‑scale, mission‑critical systems.
* Strong hands‑on experience with Kubernetes and Terraform.
* Experience deploying and operating applications in public cloud environments (AWS, Azure, or GCP).
* Solid understanding of Java and Spring Boot applications.
* Experience with monitoring, logging, and observability tools (Prometheus, Grafana, ELK, Splunk).
* Strong troubleshooting and problem‑solving skills.
* Excellent communication and collaboration skills.
Preferred Qualifications
* Experience in financial services or payments/transaction processing environments.
* Familiarity with cloud‑based analytics platforms and data engineering concepts.
* Experience with CI/CD pipelines and automation tools (Jenkins, GitHub Actions).
* Knowledge of security best practices in cloud environments.
Seniority level
Mid‑Senior level
Job function
Information Technology