Salary: £78,000 - £91,000 per year

Requirements:
- 5 years of hands-on experience writing scalable, production-grade PySpark/Spark SQL
- Strong proficiency in the AWS data stack, including EMR, Glue, S3, Athena, and Glue Workflows
- Solid foundation in SAS for understanding and debugging legacy logic
- Expertise in ETL/ELT, dimensions, facts, SCDs, and data mart architecture
- Experience with parameterisation, exception handling, and modular Python design

Responsibilities:
- Lead the end-to-end migration of SAS code (Base SAS, Macros, DI Studio) to PySpark using automated tools (SAS2PY) and manual refactoring
- Design, build, and troubleshoot complex ETL/ELT workflows and data marts on AWS
- Optimise Spark workloads for execution efficiency, partitioning, and cost-effectiveness
- Implement clean coding principles, modular design, and robust unit/comparative testing to ensure data accuracy
- Maintain Git-based workflows, CI/CD integration, and comprehensive technical documentation

Technologies: AWS, CI/CD, Cloud, ETL, Git, Python, PySpark, SAS, SQL, Spark, DevOps

More: We are seeking a Lead PySpark Engineer to drive a large-scale data modernisation project, transitioning legacy data workflows into a high-performance AWS cloud environment. This is a hands-on technical role focused on converting legacy SAS code into production-ready PySpark pipelines within a complex financial services landscape. The role is fully remote, offers 33 days of holiday entitlement (pro-rata), and provides a great opportunity to work closely with our internal team.

Last updated: week 11 of 2026