Responsibilities
* Design, implement, and manage CI/CD pipelines for automated builds, testing, and deployments.
* Maintain and optimize infrastructure as code (IaC) using tools like Terraform, Ansible, or CloudFormation.
* Manage cloud infrastructure (AWS, Azure, or GCP) for high availability and scalability.
* Implement and monitor containerized workloads and container orchestration platforms (Docker, Kubernetes, EKS, AKS, GKE).
* Ensure system reliability through logging, monitoring, and alerting solutions (Prometheus, Grafana, ELK/EFK, CloudWatch).
* Drive automation initiatives to reduce manual effort and improve system efficiency.
* Collaborate with development, QA, and security teams to enable DevSecOps practices.
* Troubleshoot production issues, perform root cause analysis, and apply permanent fixes.
* Contribute to disaster recovery and backup planning.
* Mentor junior engineers and share best practices within the team.
Required Skills & Qualifications
* 8–12 years of proven experience in DevOps, Site Reliability Engineering, or Cloud Infrastructure roles.
* Strong expertise in cloud platforms: AWS / Azure / GCP.
* Hands-on experience with CI/CD tools: Jenkins, GitLab CI, GitHub Actions, ArgoCD, etc.
* Proficiency in IaC and configuration management tools: Terraform, Ansible, Helm.
* Solid understanding of Kubernetes & containerization.
* Knowledge of networking, load balancing, security, and firewalls in cloud environments.
* Expertise in monitoring, logging, and observability.
* Scripting/programming skills in Python, Bash, or Go.
* Familiarity with DevSecOps practices and security compliance standards.