DataOps / MLOps Engineer (Databricks | Azure)
Location: London, Greater London, United Kingdom
Salary: €6000 – 8500
Hours: 32 – 40 hours per week
Domain: Tech & Digital
Overview
Build and operate the foundation of a global AI‑powered data platform. You will take ownership of the reliability, scalability, and observability of the Headless Data Architecture (HDA), ensuring that data and AI pipelines run seamlessly across regions and environments.
Your impact
In this role you ensure the underlying data and AI infrastructure is stable, scalable, and production‑ready across Europe, the UK, the US, and APAC. You monitor pipelines, resolve incidents, and drive automation, infrastructure‑as‑code, and cost optimisation, while supporting AI/ML workloads at scale.
What you will do
* Operate and maintain the end‑to‑end data and AI platform on Databricks and Azure
* Monitor, troubleshoot, and resolve production issues across data and ML pipelines
* Manage Unity Catalog governance, access control, and data sharing structures
* Build and maintain data ingestion and integration pipelines (e.g., using SnapLogic)
* Implement observability frameworks with tools such as OpenTelemetry and Grafana
* Automate infrastructure provisioning using Infrastructure‑as‑Code (e.g., Terraform)
* Optimise compute usage, scaling policies, and overall platform cost efficiency
* Support AI/ML evaluation frameworks and model validation pipelines
* Manage the lifecycle, versioning, and deployment of prompts across environments
* Implement AI guardrails, safety layers, and monitoring for production systems
* Contribute to internal tooling and platform acceleration initiatives
* Collaborate across data, platform, and AI teams to continuously improve the platform
About the role
You will be part of a growing Data & AI division that plays a central role in the transformation of HeadFirst Group x Impellam into a truly data‑driven organisation. The role ensures that everything built on the platform, from analytics to AI products, is reliable, secure, and scalable.
Qualifications
* 3–5+ years of experience in platform engineering, DataOps, or MLOps
* Extensive experience with Databricks (cluster management, jobs, Unity Catalog, Delta Lake)
* Experience with Azure (ADLS, networking, identity, cost management)
* Experience with integration platforms such as SnapLogic or similar iPaaS tools
* Solid experience with Infrastructure‑as‑Code (Terraform or equivalent)
* Strong knowledge of observability tools (OpenTelemetry, Grafana, or similar)
* Experience with CI/CD pipelines and automated deployments
* Strong Python and/or Scala skills for automation and pipeline scripting
* Understanding of data quality frameworks and monitoring
* Experience with AI/ML platform concepts such as model evaluation, deployment, or safety measures
Professional attributes
* Reliability‑first mindset—prioritise uptime, stability, and data quality
* Strong ownership—manage incidents from detection to resolution
* Collaborative approach to working with data, platform, and AI teams
* Pragmatic thinking that balances performance, cost, and scalability
* Clear communication—able to explain platform risks and status to stakeholders