I'm partnering with a leading global high-performance technology team to hire a Site Reliability Engineer (Production Infrastructure).
THE OPPORTUNITY
Join a cutting-edge engineering environment where systems reliability, automation, and performance are critical to global-scale, low-latency operations.
This is a hands-on SRE + Systems Engineering hybrid role focused on building and operating highly resilient infrastructure platforms.
WHAT YOU'LL BE WORKING ON
Building and evolving large-scale infrastructure automation
Designing observability, monitoring, and alerting systems
Improving reliability and reducing operational toil using SRE principles
Debugging deep system-level performance and latency issues
Managing CI/CD pipelines, containers, and Kubernetes workloads
Supporting global production systems across distributed teams
Driving continuous improvement across tooling and platform stability
WHAT WE'RE LOOKING FOR
5+ years of experience with Python and/or Go
Strong Linux experience
Solid CS fundamentals (data structures, algorithms)
Experience with monitoring, metrics, or statistical analysis
CI/CD and Git experience
Background in infrastructure, platform engineering, or SRE-type roles
Strong ownership mindset in production-critical environments
WHY THIS ROLE?
You'll be working in a deeply technical, research-driven environment where engineering quality, speed, and reliability directly impact global-scale systems.
High autonomy. High impact. High performance.