A global trading business is looking for a talented Trading Production Engineer to join their high-performing technology team in New York. In this role, you will be responsible for ensuring the stability, performance, and reliability of mission-critical trading systems operating in a fast-paced, real-time environment.
You will work closely with traders, developers, and infrastructure teams to support and enhance systems that demand extremely high availability and low latency.
Key Responsibilities
* Maintain and support production trading systems with a focus on uptime, resilience, and performance
* Monitor system health, respond to incidents, and perform root cause analysis
* Collaborate with development teams to improve system reliability and release processes
* Automate operational tasks and build tools to enhance system observability
* Manage deployments, releases, and change processes in production environments
* Optimise system performance, including latency and throughput improvements
* Implement and maintain monitoring, alerting, and logging solutions
* Participate in on-call rotation and provide out-of-hours support when required
Required Skills & Experience
* Proven experience in a Production Engineer, Site Reliability Engineer, or similar role
* Strong Linux/Unix systems knowledge
* Proficiency in Python
* Experience with monitoring and alerting tools (e.g., Prometheus, Grafana, ELK stack)
* Familiarity with CI/CD pipelines and deployment tooling
* Solid understanding of networking concepts (TCP/IP, DNS, load balancing)
* Strong troubleshooting skills in complex, distributed systems
* Ability to work effectively under pressure in a fast-paced environment
* Experience in financial services, trading, or low-latency environments
* Knowledge of high-frequency trading systems or real-time data pipelines
* Experience with containerisation and orchestration (Docker, Kubernetes)
* Understanding of incident management and post-mortem practices