Responsibilities of the role:
1. Data Pipeline Development: Build and maintain batch and streaming pipelines using Azure Data Factory and Azure Databricks.
2. Data Categorisation & Enrichment: Structure unprocessed datasets through tagging, standardisation, and feature engineering.
3. Automation & Scripting: Use Python to automate ingestion, transformation, and validation processes.
4. ML Readiness: Work closely with data scientists to shape training datasets, applying sound feature selection techniques.
5. Data Validation & Quality Assurance: Ensure accuracy and consistency across data pipelines with structured QA checks.
6. Collaboration: Partner with analysts, product teams, and engineering stakeholders to deliver usable and trusted data products.
7. Documentation & Stewardship: Document processes clearly and contribute to internal knowledge sharing and data governance.
8. Platform Scaling: Monitor and tune infrastructure for cost-efficiency, performance, and reliability as data volumes grow.
9. On-Call Support: Participate in an on-call rota to support the production environment, ensuring timely incident resolution and system stability outside standard working hours.
Requirements
What you will need:
The ideal candidate will be proactive, willing to develop and implement innovative solutions, and capable of the following:
Recommended:
1. 2+ years of professional experience in a data engineering or similar role.
2. Proficiency in Python, including libraries for data processing (e.g. pandas, PySpark).
3. Experience with Azure-based data services, particularly Azure Databricks, Data Factory, and Blob Storage.
4. Demonstrable knowledge of data pipeline orchestration and optimisation.
5. Understanding of SQL for data extraction and transformation.
6. Familiarity with source control, deployment workflows, and working in Agile teams.
7. Strong communication and documentation skills, including translating technical work for non-technical stakeholders.
Preferred:
1. Exposure to machine learning workflows or model preparation tasks.
2. Experience working in a financial, payments, or regulated data environment.
3. Understanding of monitoring tools and logging best practices (e.g. Azure Monitor, Log Analytics).
4. Awareness of cost optimisation and scalable design patterns in the cloud.
Job ID 788405236A