Location: Sheffield | Reporting to: VP of Engineering | Full-time
We are looking for a hands-on Lead Data Engineer to guide a small data team building an event‑driven industrial data platform. The role suits a natural technical leader: someone who enjoys writing production code, shaping data models and pipelines, mentoring engineers, and bringing real‑world experience to messy, high‑volume industrial data.
The opportunity
We are an intelligent asset management company specialising in industrial IoT data.
We are building an industrial data platform that ingests data from factory systems and turns it into structured, reliable data for analytics, machine learning, and AI‑driven applications. The platform is moving toward a Unified Namespace‑style architecture, where industrial events, asset metadata, and derived data are organised around a consistent model of customers, sites, assets, components, and measurements.
This is a hands‑on technical leadership role. You will lead a small data team while actively designing and building the pipelines, schemas, and data models that turn messy industrial data into reliable, queryable, AI‑ready data products.
You will use your real‑world engineering experience to help the team implement an existing data strategy, improve how the platform evolves, and make sure our data estate is robust enough for traditional analytics, data science, and emerging AI‑enabled workflows.
What you will work on
* Lead a small data team to implement an event‑driven, high‑volume data ingestion and normalisation strategy.
* Break the data strategy into deliverable engineering work and help the team execute it well.
* Design and build pipelines that ingest data from industrial systems, including edge devices, Ignition Edge, PLC‑connected systems, MQTT brokers, and similar sources.
* Design and implement a landing zone that standardises incoming data into well‑defined schemas.
* Apply Unified Namespace principles across the data estate, including consistent topic structures, asset context, schemas, metadata, and event‑driven processing patterns.
* Build and evolve pipelines across central broker systems, cold‑path storage for historical analysis, and hot‑path SQL/time‑series stores for real‑time access.
* Define and enforce data contracts, schemas, validation rules, and data quality checks.
* Process and structure event streams in near real time, handling issues such as ordering, duplication, late data, and schema drift.
* Model physical systems such as organisations, sites, assets, components, measurements, and relationships in a consistent and queryable way.
* Review designs and pull requests, coach engineers, and raise the standard of production data engineering practices.
* Enable downstream usage including analytics, machine learning pipelines, feature generation, and structured access patterns for AI and LLM‑based systems.
Technology environment
Our platform is service‑oriented and event‑driven, combining MQTT brokers, Azure services, data pipelines, SQL and time‑series storage, data lake storage, internal tools, data science workflows, and AI‑enabled engineering practices.
* Cloud: Azure, using core Azure services alongside custom workflows.
* Ingestion: MQTT and event‑driven pipelines.
* Processing: dbt and Databricks, with Databricks used by the Data Science team.
* Storage: Data lake storage, SQL databases, Postgres, and TimescaleDB.
* ML enablement: MLflow and downstream machine learning workflows.
* Visualisation: Grafana and internal tools.
You do not need to be a Databricks specialist or an AI specialist, but you should be comfortable working in a modern cloud data platform, learning new tools quickly, and building systems reliable enough to support analytics, machine learning, and AI‑driven workflows.
What we are looking for
We are looking for someone with strong fundamentals, practical delivery experience, and the judgement to lead a small team through complex data platform work.
* Experience working with event‑driven, streaming, or message‑based systems such as MQTT, Kafka, Kinesis, Azure Event Hubs, or similar.
* Strong engineering practices: git, CI/CD, automated testing, code review, and operational ownership.
* Ability to provide technical leadership to a small team while remaining hands‑on.
* Experience building and operating production‑grade data pipelines, batch and/or streaming.
* Strong SQL and data transformation skills, ideally using Python, Java, or similar production languages.
* Understanding of distributed data system challenges such as ordering, duplication, late data, back pressure, observability, and schema drift.
* Experience designing schemas for messy or inconsistent data sources.
* Good understanding of data contracts, validation rules, data quality, and maintainable transformation logic.
* Experience working in a cloud‑based data platform.
* Comfortable collaborating with engineering, data science, product, and senior technical stakeholders.
Highly desirable experience
* Experience with MQTT and industrial edge systems such as Ignition Edge or PLC‑connected environments.
* Understanding of time‑series data, sensor continuity, missing data, duplicated readings, spiky load, and variable schemas.
* Experience modelling physical assets, hierarchies, metadata, or graph‑like relationships.
* Familiarity with Unified Namespace concepts and ISA‑95‑style structuring patterns.
* Azure experience, particularly around data storage, eventing, compute, and operational services.
* Experience with Postgres, TimescaleDB, data lakes, dbt, Databricks, MLflow, or similar tools.
* Experience preparing data for downstream analytics, feature engineering, training datasets, or machine learning pipelines.
* Interest in AI‑assisted engineering, copilots, agents, LLM‑based workflows, MCP‑style access patterns, and AI‑consumable data platforms.
Industry context
Industrial or IoT experience is highly desirable. The role involves the physical reality of machines, sensors, connectivity issues, inconsistent source systems, time‑series continuity, and asset hierarchies.
However, we are also interested in candidates from other high‑volume or event‑driven domains where similar problems exist, such as duplicated or missing data, spiky load, schema evolution, messy source systems, and production‑critical data pipelines.
Why this role
* Autonomy: You will have leadership support to make architectural decisions and improve the platform. You are not being hired to maintain the status quo; you are being hired to help fix and evolve it.
* Impact: Your work will directly help the Data Science team spend less time cleaning data and more time building better predictive models and customer‑facing insight.
* Modern stack: You will work on an event‑driven industrial data platform using modern patterns including Unified Namespace, canonical asset modelling, time‑series data, and AI‑ready access patterns.
* Technical leadership: You will influence how a small, capable data team delivers, while still staying close to the code and architecture.
What you will get
* Tax‑efficient stock options.
* Salary sacrifice EV scheme.
* Training and professional development opportunities.
* Regular all‑hands meetings for recognition, inspiration, and transparent communication.
* Hybrid working, with 2 to 3 days per week in person.
* Quarterly employee award scheme.
* Discounted purchases through the company HR platform.
Join the team
If your background does not exactly align with every part of the job description, but you have transferable skills or experience that could be a strong match, we encourage you to highlight this in a cover letter. We are committed to personal growth as the company evolves, so if you are excited to be part of that journey, we would love to hear from you.