* Drive the design, development, and optimisation of robust data pipelines.
* Build scalable and reliable data infrastructure to power advanced AI models.
About Our Client
A pivotal and leading government entity, dedicated to leveraging advanced technology and data to serve national interests and drive strategic objectives.
Job Description
* Design, build, and optimise scalable, robust, and efficient data pipelines for ingesting, transforming, and preparing large and complex datasets, from both structured and unstructured sources, for AI/ML models.
* Develop and manage data infrastructure components in cloud environments (e.g., Azure, AWS, GCP), ensuring security, compliance, and performance for AI workloads.
* Implement data governance best practices, ensuring data quality, lineage, privacy, and security for sensitive government data used in AI applications.
* Collaborate closely with AI Scientists and Machine Learning Engineers to understand their data needs, optimising data formats, storage, and access patterns for model training and inference, particularly for LLMs.
* Develop and maintain data versioning, feature stores, and model-serving data layers to support the AI/ML lifecycle.
* Troubleshoot and resolve data-related issues, ensuring data reliability and integrity for AI projects.
The Successful Applicant
* Bachelor's degree or equivalent practical experience in Computer Science, Data Engineering, Software Engineering, or a related quantitative field.
* Minimum of 6 years of progressive experience in data engineering, with at least 3 years specifically focused on building data pipelines and infrastructure for AI/ML projects.
* Demonstrably strong experience with data engineering concepts and tools for AI, including data preparation for LLMs, natural language processing, and deep learning.
* Expert proficiency in at least one programming language commonly used in data engineering (e.g., Python, Scala, Java).
* Strong experience with big data technologies (e.g., Spark, Hadoop, Flink) and distributed data processing frameworks.
* Proven experience with cloud data platforms and services (e.g., Azure Data Factory, Azure Databricks, AWS Glue, Google Cloud Dataflow, BigQuery).
What's On Offer
An exceptional opportunity to contribute to high-impact national initiatives within a leading government entity.
Contact: Manpreet Kaur
Quote job ref: JN-052025-6748002