Role Overview
The Global Analytics team is responsible for developing and maintaining Price Discovery solutions used by the Front Office to generate and disseminate market information to clients. This data and the associated financial calculations are integrated into a range of applications across the firm. As the Site Reliability Engineer, you will play a critical role in ensuring the availability, reliability, and performance of our production applications, bridging the gap between the software and operations engineering teams.
Role Responsibilities:
1. Ensure uptime, availability, and performance of Global Analytics services
2. Define and monitor Service Level Objectives (SLOs) and Service Level Indicators (SLIs)
3. Respond to incidents and outages, working with the Software and Operations engineering teams to resolve them quickly
4. Respond to application and infrastructure alerts to prevent service disruption
5. Work with the Software Engineering team to reduce repetitive tasks such as deployments and monitoring
6. Build and maintain internal tools to improve developer productivity
7. Implement and maintain logging, metrics, and tracing systems in line with Global Architecture best practices
8. Plan capacity scaling and forecast future infrastructure needs
9. Ensure compliance with departmental policies (e.g. change management, IT security standards, release management, incident management)
10. Collaborate with the Software Engineering team to maintain and improve continuous integration and deployment pipelines
11. Collaborate with the QA team to ensure safe and reliable software releases
12. Ensure that systems are secure and meet industry standards and regulatory requirements
Experience / Competences:
1. Educated to degree level, or an equivalent combination of education and experience
2. Solid experience working with financial trading systems
3. Good high-level understanding of networking systems (e.g. firewalls, load balancers)
4. Experience working with cloud platforms, preferably AWS, and with Kubernetes and Docker
5. Experience working with monitoring and observability tools such as Grafana and Prometheus
6. Knowledge of CI/CD pipeline tools such as GitLab and Infrastructure as Code (IaC) tools such as Terraform
7. Scripting and automation experience, ideally with Python and PowerShell
8. Experience with application performance profiling tools
9. Highly analytical, with a focus on long-term results and delivery
Job Band & Level
Professional / Level 5