About the Role
Are you passionate about building resilient systems and eliminating operational toil through automation? We’re looking for a Site Reliability Engineer (SRE) to join our high-impact team and help shape the future of our digital infrastructure.
As an SRE, you’ll blend software engineering with systems engineering to ensure the reliability, availability, and performance of our platforms. You’ll work on mission-critical systems, drive automation at scale, and collaborate across teams to embed reliability into every layer of our technology stack.
What You’ll Do
* Ensure the availability, scalability, and performance of systems through proactive monitoring and capacity planning.
* Lead incident response and root cause analysis, and implement preventive measures to avoid recurrence.
* Develop automation tools and scripts to reduce manual operations and improve system resilience.
* Optimize system performance and resource usage, identifying and resolving bottlenecks.
* Collaborate with development and product teams to integrate SRE best practices into the software lifecycle.
* Help evolve our SLIs, SLOs, and error budgets so that reliability is measured against clear targets.
* Stay current with industry trends and contribute to our internal engineering communities.
What You Bring
* Proven experience as an SRE, DevOps Engineer, or Systems Engineer in a complex, high-availability environment.
* Deep expertise in Microsoft SQL Server (2016–2022), including performance tuning, high availability, and architecture.
* Strong scripting skills (e.g., PowerShell) and experience with automation/configuration tools like Ansible or Chef.
* Familiarity with observability tools, monitoring frameworks, and incident management practices.
* A mindset focused on eliminating toil, improving developer experience, and scaling operations through code.
* Excellent communication and collaboration skills.
Bonus Points
* Experience with cloud platforms (Azure, AWS, or GCP).
* Background in database automation and estate standardization.
* Knowledge of security and compliance in regulated environments.
Why Join Us?
* Work on high-impact systems that power critical business operations.
* Be part of a forward-thinking engineering culture that values innovation, learning, and collaboration.
* Get access to cutting-edge tools and technologies.
* Receive competitive compensation, benefits, and career growth opportunities.