Salary: £36,500–£64,500 per year

Requirements:
- Advanced proficiency in Python and PySpark, writing clean, modular, object-oriented code for data transformations
- Strong command of SQL (T-SQL, Spark SQL) for data exploration, validation, and final-stage modelling
- Deep hands-on experience with Microsoft Fabric and its tooling, such as Azure Data Factory (ADF) and Azure Data Lake Storage (ADLS Gen2)
- Practical experience with Git, branching strategies, automated testing (e.g. pytest), and CI/CD orchestration via Azure DevOps
- Proven commercial experience deploying and maintaining complex data solutions on the Microsoft Azure platform
- Experience collaborating with a range of stakeholders to structure data for downstream consumption (e.g. MLflow, Power BI semantic models)
- Infrastructure-as-code experience with Terraform for Azure resource provisioning
- Familiarity with streaming data architectures (Spark Structured Streaming)
- Knowledge of complementary modern data stack tools such as dbt for SQL-based transformations
- Experience integrating Large Language Models (LLMs) or operationalising AI/ML models
- Exceptional problem-solving abilities and a persistent, detail-oriented approach to debugging complex code
- Strong communication skills to translate business requirements into technical architectures
- A proactive mindset focused on continuous learning and keeping pace with the rapidly evolving data landscape
- Willingness to review code submissions, enforce coding standards, and mentor junior engineers on the team
- 3–5 years in software engineering, data engineering, or Big Data environments with a code-first approach
- Experience working in cross-functional teams

Responsibilities:
- Architect and write production-grade ELT/ETL data pipelines using PySpark and Python within the Azure ecosystem
- Build custom, reusable data processing frameworks and libraries in Python/Scala to streamline ingestion and transformation tasks across the engineering team
- Programmatically ingest large volumes of structured and unstructured data from REST APIs, streaming platforms (e.g. Event Hubs, Kafka), and legacy databases into ADLS Gen2 and OneLake
- Develop structured data models aligned to Lakehouse, Medallion Architecture, and Delta Lake patterns
- Continuously profile, debug, and optimise Spark jobs, SQL queries, and Python scripts for maximum performance and cost-efficiency at scale
- Champion DevOps best practices: implement infrastructure-as-code (Terraform), automated testing, and CI/CD deployment pipelines via Git and Azure DevOps
- Identify patterns in recurring issues and engineer permanent solutions
- Write comprehensive unit and integration tests for all data pipelines to ensure data integrity; enforce data governance protocols, RBAC, and encryption standards across all environments

Technologies: AI, Architect, Azure, Big Data, CI/CD, DevOps, ETL, Fabric, Git, Kafka, Power BI, Python, PySpark, REST, RBAC, SQL, Scala, Spark, Terraform, dbt, pytest, Cloud, Business Intelligence

More: At Synextra, we are a Microsoft-specialist Managed Service Provider headquartered in Warrington, working with regulated mid-market organisations such as law firms, financial services firms, and mortgage lenders. We are a deliberately small team of around 35, believing that technical depth yields better outcomes than a large headcount.
Our fast-growing AI Services Division is focused on expanding our data and engineering capabilities, giving you the opportunity to shape how this function operates. We offer a collaborative and dynamic work environment with opportunities for growth, mentorship, and the chance to work at the forefront of data engineering.

Last updated: week 12 of 2026