Site Reliability Engineer / DevOps Engineer (Azure)
Opportunity to join one of the top UK insurers who are on a mission to become the leading ‘digital first’ insurer in the UK. As a Site Reliability Engineer, you will be the backbone of their Azure environment, ensuring its **scalability, reliability, and operational excellence**. You will work closely with cross-functional teams to build and maintain a robust infrastructure that supports their dynamic needs.
Key Responsibilities:
* Assume responsibility for the observability suite, encompassing tools for monitoring, logging, and alerting, to guarantee a thorough and integrated understanding of system functionality and health.
* Set up and oversee APM tools like Dynatrace or New Relic, leveraging their features to effectively monitor application performance and resolve problems.
* Employ extensive DevOps expertise to establish and uphold infrastructure as code (IaC) methodologies, streamlining the processes of deployment, scaling, and management through automation.
* Actively track and pinpoint issues related to performance and reliability in APIs and applications, and devise strategies to address these concerns.
* Work in tandem with development teams to fine-tune application performance, enhance the efficiency of resource use, and improve scalability.
* Develop and sustain comprehensive incident response and review protocols to reduce system downtime and avert the repetition of problems.
* Propel ongoing enhancement efforts to boost the dependability, scalability, and operational efficiency of Ageas' infrastructure and services, staying ahead of client expectations.
* Engage in the on-call schedule, offering support for resolving incidents and conducting necessary troubleshooting.
Qualifications:
* Experience in a DevOps / Site Reliability Engineer (SRE) position, dedicated to ensuring the high availability, reliability, and scalability of live systems.
* Proficient in observability tools like Prometheus, ELK stack, Grafana, and Azure Monitor, capable of fully managing the suite for optimal system oversight.
* Skilled in operating APM tools such as Dynatrace or New Relic, with a track record of using these tools to effectively monitor and enhance application performance.
* A thorough grasp of DevOps methodologies, including the use of Terraform for infrastructure as code (IaC), and expertise in automated deployment and configuration management.
* Hands-on experience with programming environments such as Node.js, Java, and various JavaScript frameworks.
* Familiarity with cloud platforms, especially Azure, and skill in administering cloud-based infrastructure.
* Demonstrated ability to anticipate and rectify issues impacting the performance and reliability of APIs and applications.
* Excellent teamwork and communication abilities, ensuring productive collaboration with diverse functional groups.
Remote based.
Paying up to 75k, depending on experience.