NETbuilder is a leading provider of digital solutions, software, consulting, and managed services. We work across multiple sectors, with specialist expertise in the financial, government and commercial markets. Since 1999, we have been providing end-to-end solutions across Digital Delivery, Development and Technology. At our core, we are a Digital Transformation consultancy with onsite, onshore, and offshore capabilities. You will join a world-class team of experienced, successful consultants and be given full support, training and mentoring to break into the world of corporate and government consultancy.

We're recruiting for a proactive and technically skilled Site Reliability Engineer (SRE) with a strong automation mindset and previous DevOps experience to join our customer's team. This is a hands-on role, supporting critical customer platforms and driving the development and maintenance of automation solutions.

Your Responsibilities:
Provide 2nd and 3rd line support for critical platforms and automation tools
Troubleshoot and resolve complex system and application issues in production environments
Design, implement, and maintain automation processes to improve operational efficiency and system reliability
Collaborate closely with development and operations teams to ensure seamless platform integration and performance
Maintain and support automation products used by our customers
Evaluate readiness of services for production deployment (monitoring, alerting, logging, performance, failover, etc.)
Work with development teams to identify and remediate gaps in production readiness

Your Technical Skills:
Red Hat OpenShift – Helm Charts, YAML, Istio (Service Mesh), GitOps
Linux – Strong command-line and system administration skills
Python – For scripting, automation, and workflow orchestration
Terraform & HashiCorp Vault – Infrastructure as Code and secrets management
MongoDB – Key-value / NoSQL database knowledge
GCP (Google Cloud Platform) – Experience managing and supporting workloads in GCP

Optional/Desirable:
Apache Airflow – Workflow orchestration using Python/SQL
Apache Kafka – Event streaming with Java, Python, or Scala
Cisco NSO – YANG modelling, service development (nice to have, but not essential)

Your Experience:
Proven background in a Site Reliability Engineering or DevOps role
Experience supporting production environments and delivering operational excellence
A track record of automating manual processes and implementing scalable infrastructure solutions
Comfortable working in fast-paced, collaborative environments with a strong focus on reliability and performance
Excellent troubleshooting, analytical, and communication skills

Your Soft Skills:
Problem-solving mindset – Ability to remain calm under pressure and logically approach complex issues
Collaboration & teamwork – Comfortable working with cross-functional teams
Communication – Clear and concise communicator, capable of explaining technical details to both technical and non-technical stakeholders
Proactiveness – Takes initiative to identify issues before they become problems and is constantly looking for opportunities to improve systems and processes
Adaptability – Able to thrive in dynamic environments, switch contexts effectively, and manage multiple priorities
Attention to detail – Careful when working on mission-critical systems where minor misconfigurations can have major impacts
Customer focus – Keeps end-user reliability and performance top of mind, especially when developing or supporting automation solutions

We welcome talent at all career stages and are dedicated to understanding and supporting additional needs. We're proud to be an equal opportunity employer, committed to creating an inclusive and open environment for everyone.