Site reliability engineer

London

TP ICAP

Posted: 29 August

Offer description

Role Overview

The Global Analytics team is responsible for developing and maintaining Price Discovery solutions used by the Front Office to generate and disseminate market information to clients. This data and the associated financial calculations are integrated into a range of applications across the firm. As the Site Reliability Engineer, you will play a critical role in ensuring the availability, reliability, and performance of our production applications, bridging the gap between the software engineering and operations engineering teams.

Role Responsibilities:

1. Ensure uptime, availability, and performance of Global Analytics services

2. Define and monitor Service Level Objectives (SLOs) and Service Level Indicators (SLIs)

3. Respond to incidents and outages, working with the Software and Operations engineering teams to resolve them quickly

4. Respond to application and infrastructure alerts to prevent service disruption

5. Work with the Software Engineering team to reduce repetitive tasks such as deployments and monitoring

6. Build and maintain internal tools to improve developer productivity

7. Implement and maintain logging, metrics and tracing systems in line with Global Architecture best practices

8. Plan capacity and scaling by forecasting future infrastructure needs

9. Ensure compliance with departmental policies (e.g. change management, IT security standards, release management, incident management)

10. Collaborate with the Software Engineering team to maintain and improve continuous integration and deployment pipelines

11. Collaborate with the QA team to ensure safe and reliable software releases

12. Ensure that systems are secure and meet industry standards and regulatory requirements

Experience / Competences:

1. Educated to degree level, or an equivalent combination of education and experience

2. Solid experience working with financial trading systems

3. Good understanding of networking systems at a high level (e.g. firewalls, load balancers)

4. Experience working with cloud platforms, preferably AWS, with Kubernetes and Docker

5. Experience working with monitoring and observability tools such as Grafana and Prometheus

6. Knowledge of CI/CD pipeline tools such as GitLab and Infrastructure as Code (IaC) tools such as Terraform

7. Scripting and automation experience, ideally with Python and PowerShell

8. Experience with application performance profiling tools

9. Highly analytical, with a focus on long-term results and delivery

Job Band & Level

Professional / Level 5

#LI-ASO #LI-Hybrid

