Site Reliability Engineer (SRE) - Payments
London, England, United Kingdom Software and Services
Description
SRE and Engineering Operations Engineers in the team take part in every aspect of the software development lifecycle. We work in a fast-paced environment and are responsible for hands-on coding of critical system components. We have constructive design discussions, learn from each other, and use our experience to guide and teach. We work closely with privacy and security engineering teams to ensure that the products we build go above and beyond on both fronts. We also partner closely with quality and testing teams, and understand that their success is ours as well. Onboarding will be easier for you if you have hands-on experience with Java or another JVM-based language, and experience developing highly available, high-throughput distributed systems. Other tech that’s relevant to us includes workflow orchestration, relational and non-relational databases, message queueing, application container orchestration, and cloud deployment.
Minimum Qualifications
* Production experience operationalizing large-scale, distributed, fault-tolerant, multi-tenant services.
* Excellent code debugging and optimization skills, with strong analytical problem-solving ability.
* Experience building systems both on-premises (data center) and on public cloud (AWS, GCP, or Azure welcome).
* Understanding of core SRE concepts: monitoring, alerting, and incident management.
Preferred Qualifications
* Expertise with container platforms (e.g. Docker).
* Experience in presenting complex technical concepts to both technical and non-technical stakeholders.
* Proven track record of taking ownership and delivering results.
* Strong background in leading multi-functional projects.
* Experience managing large numbers of diverse systems with configuration management tools such as Puppet, Chef, Ansible, or Salt.