Department: Cyber Services and Capabilities
Location: GBR Manchester Hardman Boulevard
Description
This role can be based in Manchester, Cheltenham or London.
We are seeking a skilled Data Engineer to join our Engineering team. You will be responsible for designing, building and optimising scalable data pipelines that power advanced analytics and machine learning solutions, and you will play a key role in enabling data-driven decision-making by delivering high-quality, reliable datasets to tools such as Amazon SageMaker and other analytics platforms.
Key Responsibilities
You will work closely with the Data Science team to develop robust data pipelines that feed analytics and machine learning tools such as Amazon SageMaker and third-party platforms such as Databricks. You will use AWS services such as EMR, S3 and EKS, together with Apache Airflow, to process and orchestrate high-volume datasets, ensuring solutions are scalable, resilient and cost-efficient. You will also embed data loss prevention (DLP) principles and controls into data pipelines to protect sensitive information, while ensuring data is reliable, accessible, well-governed and optimised for downstream consumption.
Skills, Knowledge & Expertise
Essential
* Strong experience in data engineering within AWS cloud environments.
* Hands-on experience with AWS big data technologies such as EMR, S3 and SageMaker.
* Proficiency in Python for building scalable data pipelines and processing frameworks.
* Experience with Apache Spark for distributed data processing.
* Experience designing and maintaining scalable batch and real-time data pipelines.
* Solid understanding of ETL/ELT design patterns and data modelling techniques.
* Experience with workflow orchestration tools such as Apache Airflow (ideally deployed on AWS).
* Familiarity with containerisation and orchestration using Docker and Kubernetes (EKS).
* Experience with infrastructure as code (e.g. Terraform) and CI/CD/GitOps practices.
* Proven ability to optimise performance and reduce cloud costs through partitioning, clustering and workload management.
* Understanding of data security principles, including data loss prevention (DLP).
Desirable
* Experience with Databricks or similar third-party big data platforms.
* Knowledge of real-time streaming technologies (e.g. Kafka, Kinesis).
* Experience implementing data governance and compliance frameworks.
* Familiarity with monitoring and observability tools in AWS environments.
* Exposure to Lakehouse or modern data platform architectures.
Job Benefits
* Flexible Working: Balance your work and personal life with our flexible working options.
* Generous Holiday Allowance: Enjoy 25 days of holiday, plus bank holidays, with the option to buy up to 5 additional days of annual leave.
* Medicash & Critical Illness Scheme
* Financial & Investment Benefits: Enjoy peace of mind with our Pension, Life Assurance, and Share Save Scheme.
* Community & Volunteering Programmes: Make a difference in your community with our volunteering opportunities.
* Green Car Scheme: Drive green and save money with our eco-friendly car scheme.
* Cycle Scheme: Stay fit and healthy with our cycle-to-work scheme.
* Special Time Off: Take time off for those big moments in life, like getting married/entering into a civil partnership, becoming a grandparent, and welcoming home a new pet.
* Family Planning: Benefit from our generous maternity and paternity leave, as well as time off and support for those undergoing fertility treatments.