Your Job Title: Site Reliability Engineer
Location: UK - Newcastle
Your Business Sector: AECO - Architects, Engineers, Construction & Owners
What You Will Do
We are seeking a skilled and motivated Site Reliability Engineer to join our team in Trimble’s Project Delivery Cloud Platform and take responsibility for the infrastructure of our cutting-edge reality capture solution running on Microsoft Azure. The ideal candidate will have a strong background in cloud platforms, infrastructure as code, and automation via programming/scripting languages. You will work with a distributed team to drive the reliability, scalability, and security of the service and infrastructure.
Key Responsibilities
* Develop and maintain infrastructure as code (IaC) using Terraform to ensure reliable and scalable cloud environments;
* Implement and enhance observability solutions using tools like New Relic, Datadog, Sumo Logic and Splunk for monitoring, logging, and alerting;
* Perform code deployments and manage CI/CD pipelines using Jenkins, GitHub, and related tooling to ensure smooth and efficient delivery processes;
* Automate routine tasks and workflows to increase operational efficiency and reduce manual intervention;
* Evaluate system designs and architectures for reliability, performance, security, and efficiency, ensuring best practices are followed;
* Lead incident response efforts, conduct root cause analysis, and implement long-term solutions for complex issues;
* Develop and maintain comprehensive runbooks and procedures for incident response and operational tasks;
* Collaborate with cross-functional teams to review and provide feedback on technical designs, ensuring alignment with SRE principles;
* Participate in on-call rotations and handle critical incidents with confidence and expertise;
* Continuously improve documentation for systems and services, contributing to a knowledge-sharing culture within the team.
What Skills & Experience You Should Bring
* Bachelor’s or Master’s degree in Computer Engineering or a related field;
* At least 5 years of technical experience with a proven ability to take ownership;
* Strong collaboration skills and experience leading cross-functional work;
* Demonstrated success in managing infrastructure in production environments;
* Expertise in capacity planning and cost optimisation for efficient operations;
* Extensive experience with cloud-provider-hosted infrastructure (Amazon Web Services & Azure);
* Proficiency in high-level scripting languages (Python) and infrastructure as code (IaC) tools (Terraform), along with containerisation;
* Experience with Kubernetes or other containerisation technologies;
* Familiarity with CI/CD pipelines and tools such as Azure DevOps, Jenkins, Argo CD, Helm, GitHub;
* Experience with incident management processes and monitoring tools such as Prometheus, Grafana, New Relic, Datadog, Splunk, CloudWatch, Sumo Logic, etc.;
* Strong understanding of networking and security concepts.
Additional experience preferred in
* SRE observability experience with New Relic or Datadog;
* OpenTelemetry;
* AIOps/MLOps;
* SecOps.
How to Apply: Please submit an online application for this position by clicking on the ‘Apply Now’ button located in this posting.
Join a Values-Driven Team: Belong, Grow, Innovate.
At Trimble, our core values of Belong, Grow, and Innovate aren't just words—they're the foundation of our culture. We foster an environment where you are seen, heard, and valued (Belong); where you have an opportunity to build a career and drive our collective growth (Grow); and where your innovative ideas shape the future (Innovate). We believe in empowering local teams to create impactful strategies, ensuring our global vision resonates with every individual. Become part of a team where your contributions truly matter.
If you need assistance or would like to request an accommodation in connection with the application process, please contact AskPX@px.trimble.com.