Location: London, South East England, United Kingdom
Company: JR UK
Client / Employer: Whitehall Resources
Posted: 29.05.2026
Job reference: 58883502338211840037341
What you’ll do
* Engineer production-grade data pipelines on AWS (EMR, S3, Lambda) using PySpark/Python and SQL, with a focus on performance, resilience, testing, and observability.
* Migrate and modernise legacy workloads (e.g., ETL jobs and reporting feeds) onto cloud-native services, creating reusable components and shared frameworks.
* Support reporting & MI use cases, including transformations and data models that feed downstream tools (e.g., Power BI).
* Own CI/CD and version control practices (e.g., Git/GitLab), review code, and enforce engineering standards.
* Coach and mentor engineers, provide technical guidance/code reviews, and contribute to architectural decisions across squads.
* Work in Agile delivery, collaborating across product, data, and platform teams using Jira/Confluence; translate requirements into robust engineering tasks.
* Embed security and compliance by design, aligning with BPSS/SC constraints and departmental data-handling policies.
Essential skills & experience
* Hands-on expertise in AWS & Spark: Amazon EMR, S3, Lambda; strong PySpark/Python and SQL for large-scale batch processing.
* Data engineering at scale in government or similarly complex domains, including performance tuning and data quality management.
* CI/CD & DevOps: pipelines and IaC (e.g., Terraform), automated testing, and release governance.
* Version control & collaboration: Git/GitLab, code review, branching strategies, and trunk/PR workflows.
* APIs & integration: building/consuming data services to move and expose data safely and reliably.
* Agile ways of working with Jira/Confluence; clear stakeholder communication and concise technical documentation.
* Security clearance: BPSS (minimum), and SC-cleared or SC-clearable, for UK government work.
* Data warehousing & modelling (e.g., Redshift; dimensional modelling; dbt).
* Basic Power BI familiarity to partner with BI developers and validate end-to-end data flows.
* AWS ecosystem depth (Athena, Redshift, EC2, CloudWatch, IAM) and event-driven patterns.
Certifications (nice to have)
* AWS Certified Cloud Practitioner (or higher), Azure AI Fundamentals (awareness of ML/AI services).
Autonomy
* Works under general direction; plans own work; designs and implements PySpark jobs on EMR, modernising legacy code with minimal supervision.
Influence
* Shapes standards through code reviews and mentoring; influences delivery outcomes across teams.
Complexity
* Handles substantial, multifaceted engineering tasks (e.g., migration to a new MI platform; data quality resolution; estimating effort).
Business skills
* Communicates effectively with stakeholders; aligns data products to reporting/decision-making needs; contributes to Agile ceremonies.