Location: Milton Keynes, hybrid
Type of employment: Contract or Permanent
Lead end-to-end data workflows, from requirements through delivery, including data product creation and secure data transfer from Google Cloud Platform to PostgreSQL.
1. Develop & Schedule SQL Views via DAGs
Design and implement SQL views aligned with business needs, prioritizing clarity, reusability, and efficiency.
Build and manage workflow orchestrations (e.g., Airflow DAGs) to automate those views, ensuring reliable execution on daily, weekly, or customized schedules.
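A minimal sketch of the view-definition half of this responsibility, using SQLite as a local stand-in for the warehouse; the table and view names (orders, v_daily_revenue) are illustrative. In a real deployment, refresh_view would be the callable behind a scheduled Airflow task rather than invoked directly.

```python
import sqlite3

# Illustrative, reusable view definition; real views would target the warehouse.
VIEW_SQL = """
CREATE VIEW v_daily_revenue AS
SELECT order_date, SUM(amount) AS revenue
FROM orders
GROUP BY order_date;
"""

def refresh_view(conn: sqlite3.Connection) -> None:
    """Idempotent view (re)creation -- safe to run on every scheduled tick."""
    conn.execute("DROP VIEW IF EXISTS v_daily_revenue;")
    conn.executescript(VIEW_SQL)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_date TEXT, amount REAL);")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("2024-01-01", 10.0), ("2024-01-01", 5.0), ("2024-01-02", 7.5)],
)
refresh_view(conn)
rows = conn.execute("SELECT * FROM v_daily_revenue ORDER BY order_date").fetchall()
print(rows)  # -> [('2024-01-01', 15.0), ('2024-01-02', 7.5)]
```

Because the refresh drops and recreates the view, the same task can run on daily, weekly, or custom schedules without manual cleanup.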
2. Execute Cross‑Platform ETL with AWS Glue
Develop, deploy, and maintain AWS Glue jobs to extract data from GCP (such as BigQuery or GCS) and load it into PostgreSQL.
Set up secure connectivity, schedule jobs via cron or trigger mechanisms, and ensure data pipelines are reliable and idempotent.
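The idempotency requirement above can be sketched with an upsert-style load. This example uses SQLite's ON CONFLICT syntax, which mirrors PostgreSQL's; the customers table and columns are hypothetical, and a Glue job would perform the same pattern against the real target.

```python
import sqlite3

# Upsert keyed on the primary key: replaying a batch updates rather than duplicates.
UPSERT = """
INSERT INTO customers (id, email) VALUES (?, ?)
ON CONFLICT(id) DO UPDATE SET email = excluded.email;
"""

def load_batch(conn: sqlite3.Connection, rows) -> None:
    """Re-runnable load step: the same batch can be replayed after a failure."""
    with conn:  # commit on success, roll back on error
        conn.executemany(UPSERT, rows)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT);")
batch = [(1, "a@example.com"), (2, "b@example.com")]
load_batch(conn, batch)
load_batch(conn, batch)  # second run is a no-op, not a duplicate insert
print(conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0])  # -> 2
```

Writing every load this way is what makes retries after a partial failure safe.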
3. Monitor, Troubleshoot & Resolve Incidents
Continuously oversee ETL workflows in Airflow and AWS Glue, proactively responding to alerts and errors.
Conduct root cause analysis for pipeline failures, whether caused by schema mismatches or performance bottlenecks, and apply robust fixes. Document resolutions to strengthen system resilience.
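One common building block for this kind of operational resilience is a retry wrapper that logs every failed attempt, so transient errors recover automatically and persistent ones leave a trail for root-cause analysis. A minimal stdlib sketch, with a hypothetical flaky task standing in for a pipeline step:

```python
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("pipeline")

def run_with_retries(task, retries: int = 3, delay: float = 0.0):
    """Retry transient failures; log each attempt so failures are traceable."""
    for attempt in range(1, retries + 1):
        try:
            return task()
        except Exception as exc:
            log.warning("attempt %d/%d failed: %s", attempt, retries, exc)
            if attempt == retries:
                raise  # exhausted -- surface the error to the alerting layer
            time.sleep(delay)

calls = {"n": 0}
def flaky():
    """Stand-in task that fails twice, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient connection error")
    return "ok"

print(run_with_retries(flaky))  # -> ok
```

In Airflow this behaviour maps onto task-level retries and alert callbacks; the point is that every failure is both handled and recorded.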
4. Design, Build, & Govern Data Products
Architect, construct, and maintain reusable data products, embedding clean datasets, metadata, governance policies, and clearly defined data contracts.
Ensure compliance with the FAIR principles (data that is Findable, Accessible, Interoperable, and Reusable) and enforce robust access controls in collaboration with governance stakeholders.
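A data contract can be as simple as a machine-checkable column specification validated before a dataset is published. A minimal sketch, with a hypothetical customer-dataset contract; real contracts would also carry metadata such as ownership and SLAs:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ColumnContract:
    name: str
    dtype: type
    nullable: bool = False

# Hypothetical contract for a published customer dataset.
CONTRACT = [
    ColumnContract("id", int),
    ColumnContract("email", str, nullable=True),
]

def validate(rows, contract):
    """Collect contract violations so bad records never reach consumers."""
    errors = []
    for i, row in enumerate(rows):
        for col in contract:
            value = row.get(col.name)
            if value is None:
                if not col.nullable:
                    errors.append(f"row {i}: {col.name} is null")
            elif not isinstance(value, col.dtype):
                errors.append(f"row {i}: {col.name} has type {type(value).__name__}")
    return errors

rows = [{"id": 1, "email": "a@example.com"}, {"id": "oops", "email": None}]
print(validate(rows, CONTRACT))  # one violation: id in row 1 is a str, not int
```

Enforcing the contract at the pipeline boundary is what makes the published product reusable by consumers who never see the source systems.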
5. Translate Requirements into Technical Designs
Gather and analyze requirements via stakeholder engagement, user stories, or use cases.
Convert these into detailed design artifacts, including architecture diagrams, data models, and specifications for development.
6. Optimize Performance Across the Stack
Continuously refine ETL pipelines, SQL logic, and data workflows to boost efficiency and scalability. Techniques may include indexing, partitioning, caching, or employing materialized views to improve query speed.
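The effect of indexing can be made concrete by inspecting a query plan before and after adding an index. A small sketch using SQLite (the events table and index name are illustrative; PostgreSQL's EXPLAIN serves the same purpose):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id INTEGER, payload TEXT);")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(i % 100, "x") for i in range(10_000)],
)

def plan(sql: str) -> str:
    """Return the query plan as a single string for easy inspection."""
    return " ".join(r[3] for r in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT COUNT(*) FROM events WHERE user_id = 42"
print(plan(query))  # full table scan before the index exists

conn.execute("CREATE INDEX idx_events_user ON events(user_id);")
print(plan(query))  # the plan now searches via idx_events_user
```

The same before/after discipline applies to partitioning, caching, and materialized views: measure the plan or runtime, change one thing, and measure again.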
7. Lead Migration from hh360 to BigQuery
Architect and drive a seamless migration strategy to move data and pipelines from the legacy hh360 system into Google BigQuery.
Employ iterative migration patterns for safe data transfers, rigorous validation, and phased deprecation of legacy infrastructure.
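The rigorous-validation step above can be sketched as a fingerprint comparison between a legacy table and its migrated copy. This example uses two SQLite connections as stand-ins for hh360 and BigQuery; the accounts table is hypothetical:

```python
import hashlib
import sqlite3

def table_fingerprint(conn: sqlite3.Connection, table: str, key: str):
    """Row count plus a content hash over key-ordered rows, for migration checks."""
    count = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    digest = hashlib.sha256()
    for row in conn.execute(f"SELECT * FROM {table} ORDER BY {key}"):
        digest.update(repr(row).encode())
    return count, digest.hexdigest()

# Stand-ins for the legacy system and the migration target.
legacy = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
for db in (legacy, target):
    db.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, name TEXT);")
    db.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, "a"), (2, "b")])

assert table_fingerprint(legacy, "accounts", "id") == \
       table_fingerprint(target, "accounts", "id")
print("migration batch validated")
```

Running such checks per batch supports the phased pattern: each slice of data is verified before the corresponding legacy pipeline is deprecated.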