Senior Data Platform Engineer
Manchester, Cheltenham or London
We are seeking a skilled Data Engineer to join our Engineering team, responsible for designing, building and optimising scalable data pipelines that power advanced analytics and machine learning solutions.
Working closely with the Data Science team, you will play a key role in enabling data-driven decision-making by delivering high-quality, reliable datasets to analytics and machine learning tools such as Amazon SageMaker and third-party platforms like Databricks.
You will use AWS services such as EMR, S3 and EKS, together with Apache Airflow, to process and orchestrate high-volume datasets, ensuring solutions are scalable, resilient and cost-efficient.
You will also play a key role in embedding data loss prevention (DLP) principles and controls into data pipelines to protect sensitive information, while ensuring data is reliable, accessible, well-governed and optimised for downstream consumption.
What we are looking for in you
Essential
* Strong experience in data engineering within AWS cloud environments.
* Hands-on experience with AWS big data technologies such as EMR, S3 and SageMaker.
* Proficiency in Python for building scalable data pipelines and processing frameworks.
* Experience with Apache Spark for distributed data processing.
* Experience designing and maintaining scalable batch and real-time data pipelines.
* Solid understanding of ETL/ELT design patterns and data modelling techniques.
* Experience with workflow orchestration tools such as Apache Airflow (ideally deployed on AWS).
* Familiarity with containerisation and orchestration using Docker and Kubernetes (EKS).
* Experience with infrastructure as code (e.g. Terraform) and CI/CD/GitOps practices.
* Proven ability to optimise performance and reduce cloud costs through partitioning, clustering and workload management.
* Understanding of data security principles, including data loss prevention (DLP).
Desirable
* Experience with Databricks or similar third-party big data platforms.
* Knowledge of real-time streaming technologies (e.g. Kafka, Kinesis).
* Experience implementing data governance and compliance frameworks.
* Familiarity with monitoring and observability tools in AWS environments.
* Exposure to Lakehouse or modern data platform architectures.
Jobs are provided by the Find a Job Service from the Department for Work and Pensions (DWP).