CoE Lead - Observability & Tooling

Bury
TN United Kingdom
€80,000 - €100,000 a year
Posted: 8 May
Offer description

CoE Lead - Observability & Tooling, Bury

Client: JD Group
Location: Bury, United Kingdom
Job Category: Other
EU work permit required: Yes
Job Reference: 2900839ac228
Job Views: 6
Posted: 05.05.2025
Expiry Date: 19.06.2025


Job Description:

The CoE Lead - Observability & Tooling at JD Sports Fashion Plc is a critical, hands-on technical role focused on designing, building, and maintaining the company's Observability platform. The role ensures that our technology platforms operate efficiently and reliably, providing early insights for Engineering, Service Reliability, Service Delivery, and DevOps teams.

The CoE Lead will manage the contract with third-party providers responsible for the execution layer, ensuring adherence to service-level agreements (SLAs) and key performance indicators (KPIs). The position involves a 75% focus on the design of frameworks and a 25% focus on implementation and adoption.

Job Title – Centre of Excellence Lead - Observability & Tooling

Working hours – 40

What You'll Be Doing:

We are looking for an experienced CoE Lead to design, build, and maintain our Observability platform. The CoE Lead will work closely with DevOps, Engineering, Service Reliability, and Service Delivery teams to continuously improve our Observability capabilities.

This role is a technical, hands-on position with a 75% focus on framework design and 25% on implementation and adoption.

You will contribute to pipeline design, enabling observability from the first deployment in test environments and providing early insights for Engineering, Service Reliability, Service Delivery, and DevOps teams. The role involves building frameworks for intelligent alerts to help Service Delivery teams quickly triage incidents and enable automated runbooks. Additionally, you will identify and deploy tools to automate incident detection, notifications, triage, and resolution.

Key Responsibilities:

* Pipeline Approach: Adopt a pipeline approach to enable observability of services deployed across multiple environments, balancing monitoring, logging, and tracing based on service classification.
* Intelligent Alerts: Design and build intelligent alerts using pipelines, and onboard automated runbooks whose triggers are recorded with clear audit logs in service management tools such as Jira Service Management.
* Dashboards: Create and maintain dashboards for proactive monitoring of services to help teams resolve incidents quickly.
* Monitoring Capability: Continuously improve monitoring capabilities to identify key alerts and thresholds for early warnings before services fail.
* Automation: Enable intelligent alerts that carry fine-grained details of the underlying services causing issues, and extend them to trigger automated runbook execution with clear audit logs.
* Collaboration: Work closely with DevOps, Service Reliability, and Service Delivery teams to identify and deploy tools that automate incident detection, notifications, triage, and resolution.

What We're Looking For:

Skills:

* Leadership and Collaboration: Strong leadership skills with the ability to mentor, coach, and develop high-performing teams.
* Excellent communication and interpersonal skills, capable of building strong relationships with both technical and business stakeholders.
* Proven ability to collaborate effectively with cross-functional teams, including DevOps, Engineering, Service Reliability, and Service Delivery teams.
* Technical Expertise: In-depth knowledge of open-source and commercial observability tools (e.g., Prometheus, Grafana, New Relic).
* Expertise in cloud environments (e.g., AWS, Azure) and infrastructure as code (IaC) tools like Terraform.
* Monitoring and Observability: Experience in creating and maintaining dashboards for proactive monitoring of services.
* Ability to design and build intelligent alerts using pipelines, enabling early detection of issues and automated incident response.
* Knowledge of the latest technology trends in the monitoring landscape, such as OpenTelemetry.
* Contract Management: Experience in managing third-party provider contracts, including negotiating terms, monitoring performance, and ensuring adherence to SLAs and KPIs.
* Ability to integrate third-party providers seamlessly into the organisation's workflows, aligning with the overall strategic vision.

Experience:

* Professional Experience: Minimum of 5-8 years of experience in technology service delivery and management, focusing on observability, monitoring, and tooling.
* Service Management: Practical experience in building and maintaining a Service Catalogue, assigning service level objectives (SLOs), and measuring service level indicators (SLIs).
* Experience in operating production services during peak trading periods without service degradation.
* Automation and Tooling: Knowledge of automation tools to simplify alert notifications and extend to automated runbook execution.
* Experience in implementing observability solutions for retail stores or similar environments.
* Proven experience in overseeing and managing Atlassian tools for effective tracking, collaboration, and service management.
