Please note: This is a fully onsite role (5 days a week in the office).
We’re looking for an experienced Sr. DevOps/Site Reliability Engineer to build and optimize scalable, resilient cloud infrastructure. You’ll partner with development teams to improve automation and CI/CD, while also owning incident response and system reliability. This includes monitoring, troubleshooting, and ensuring our services remain highly available and performant.
A day in the life of our Sr. DevOps/SRE:
* Respond to monitoring alerts, participate in incident calls, and guide incidents to resolution.
* Collaborate with software development teams to facilitate their daily operations.
* Design, configure, and optimize CI/CD pipelines.
* Build, monitor, and maintain a resilient and scalable infrastructure.
* Maintain documentation for processes, architectures, and configurations.
Qualifications
Who we are looking for:
* Strong analytical and troubleshooting skills.
* Hands‑on experience with AWS CloudOps.
* Understanding of cloud security best practices and industry standards.
* Willingness to participate in an on-call rotation.
* Minimum of 7 years in a DevOps / SRE role.
* 7 years working with Linux and Windows systems.
* 3 years of advanced experience developing Terraform modules.
* 3 years of production experience with Docker and Kubernetes (EKS).
* 5 years of expertise in AWS services (EC2, RDS, S3, ElastiCache, WAF, CDN, Route 53).
* Experience in cloud networking (Transit Gateway, subnets, routing, security groups).
* Strong knowledge of Jenkins and GitLab.
* Hands‑on experience configuring IIS, NGINX, or other web servers.
* Proficient with monitoring solutions (Zabbix, Prometheus, Grafana, etc.).