Requirements
* Ideally, already an SRE/DevOps Engineer supporting Kubernetes clusters (EKS preferred) on AWS, with extensive experience in Disaster Recovery and Business Continuity.
* (Desirable) Leadership/management of other SRE/DevOps engineers.
* Experience with managing AWS Infrastructure with Terraform on enterprise systems.
* A good understanding of AWS IAM, Roles, Policies and Permissions
* Kubernetes RBAC and IRSA
* Observability with Prometheus / Grafana / Alertmanager / CloudWatch
* A good command of a programming language, and comfortable with source control and pull requests on GitHub
* Linux expertise, ideally with experience of cloud-init and Ansible
* CI with Bitbucket Pipelines / CD with ArgoCD / Helm charts
What the job involves
* We are looking for someone to join our Site Reliability and DevOps team and play a key role in supporting our Disaster Recovery, Business Continuity and Infosec initiatives.
* Setting up and managing Disaster Recovery and Business Continuity processes.
* Helping manage penetration testing and certification processes.