Salary: £80,000 - £85,000 per year

Requirements:
- Experience working with large-scale datasets (multi-terabyte minimum)
- Strong background in data engineering within distributed systems
- Proven ability to build and optimise data pipelines at scale
- Solid Python skills with exposure to tools such as Spark, PySpark, and Hadoop
- Strong understanding of data ingestion and batch processing methodologies
- Experience with text or unstructured data is advantageous but not essential
- A hands-on, problem-solving mindset with a focus on engineering quality

Responsibilities:
- Work with large-scale, multi-terabyte datasets (100TB), primarily text-based
- Build and optimise data ingestion and processing pipelines
- Improve batch processing performance and throughput across large datasets
- Solve challenges related to scalability, reliability, and data flow
- Contribute to the development of distributed data platforms
- Collaborate within a highly technical, research-driven environment

Technologies: AI, Flow, Hadoop, Python, PySpark, Spark, Support, Security

More: We are a specialist technology consultancy delivering advanced data and AI solutions within highly secure government environments. This role offers a unique opportunity to tackle complex data problems in a calm, focused, and highly technical environment, with an emphasis on autonomy and trust. We offer a competitive salary of £80,000 - £85,000, an approximately 10% bonus, and pension contributions, along with strong benefits and clear opportunities for professional growth. The position is hybrid, requiring only one day per week on-site.

Last updated: week 18 of 2026