
Data Engineer (SC Cleared)

Manchester
scrumconnect ltd
Data engineer
€60,000 a year
Posted: 18 May
Offer description

Apache Spark Python AWS Cloud Data Pipelines

A hands-on data engineering role within a large-scale cloud data programme, responsible for building, maintaining, and troubleshooting data pipelines using Apache Spark, PySpark, Apache Airflow, and a broad suite of AWS services. You will apply strong analytical and engineering skills to deliver trusted, well-governed data assets in a modern, cloud-native environment.

Active SC clearance is a mandatory, non-negotiable requirement. Candidates must hold current, in-date Security Check (SC) clearance at the time of application. Sponsorship is not available. Applications without active SC clearance will not be considered.


Working arrangement

This role is hybrid. Candidates must be willing and able to travel to the London office three days per week. Remaining days may be worked remotely from anywhere in the UK.


About the role

You will work as a Data Engineer on a complex, cloud-based data programme – designing, building, and maintaining data pipelines that process large volumes of data across a modern AWS-native stack. Using Apache Spark and PySpark for distributed data processing, Apache Airflow for orchestration, and a range of AWS services for storage, compute, and analytics, you will help deliver reliable, well-governed data assets to downstream users.

You will apply strong data analysis skills to identify root causes of data issues, work with dimensional data models and slowly changing dimensions, and implement infrastructure as code using Terraform. Familiarity with DWP engineering best practices and the ability to translate customer expectations into applied technical functionality are key to success in this role.


Key responsibilities


Data pipeline development

Build and maintain scalable data pipelines using Apache Spark and PySpark, processing and transforming large datasets across distributed cloud infrastructure.


Workflow orchestration

Configure and manage Apache Airflow DAGs for task orchestration, ensuring reliable scheduling, monitoring, and execution of data processing workflows.


Root cause analysis

Perform data analysis to identify and resolve root causes of pipeline failures and data quality issues – including reviewing EMR output logs and CloudWatch metrics.


Data modelling

Apply an understanding of dimensional data models and slowly changing dimensions (SCD) to design and maintain well-structured, analytically trusted data assets.


Infrastructure as code

Provision and manage cloud infrastructure using Terraform. Containerise solutions using Docker and manage deployments through GitLab CI/CD pipelines and release tagging.


Security & encryption

Apply an understanding of both server-side and client-side encryption patterns within AWS. Work within IAM policies and data governance standards appropriate to a regulated government environment.


Technical skills required


Languages & analytics

* Python – primary language for pipeline development and data processing
* SQL – used for querying, transformation, and validation across data stores
* PySpark – for distributed data processing using Apache Spark on AWS EMR
* Familiarity with basic data structures for constructing robust, scalable solutions


Data processing & orchestration

* Apache Spark – understanding of distributed data processing architecture and execution
* Apache Airflow – configuring DAGs and managing task orchestration at scale
* Jupyter Notebooks – for exploratory data analysis and pipeline prototyping
* Understanding of dimensional data models and slowly changing dimensions (SCD Types 1, 2, 3)
* Data analysis skills to identify the root cause of issues within pipelines and data assets


AWS services

* Amazon EMR – running Spark workloads and reviewing output logs
* Amazon Athena – ad hoc querying of data in S3
* Amazon Textract and Comprehend – familiarity with AI/ML document extraction and NLP services
* AWS S3, IAM, CloudWatch, EC2, ECR – core platform services used day‑to‑day
* AWS console proficiency – navigating, configuring, and monitoring services
* Understanding of server-side and client-side encryption within AWS


Infrastructure, DevOps & delivery

* Terraform – Infrastructure as Code for provisioning and managing AWS environments
* Docker – containerisation of data engineering solutions
* GitLab – source code management, CI/CD pipeline configuration, release tagging, and component versioning
* Familiarity with DWP engineering best practices
* Ability to translate customer expectations into applied, functional technical solutions


Technology stack at a glance

* Python
* PySpark
* SQL
* Apache Spark
* Apache Airflow
* Jupyter Notebooks
* Dimensional modelling/SCD
* AWS
* Amazon EMR
* Amazon Athena
* AWS S3
* AWS IAM
* AWS CloudWatch
* AWS EC2/ECR
* Amazon Textract
* Amazon Comprehend
* Terraform
* Docker
* GitLab CI/CD
* GitLab Tags