Role: Cloud Data Loading Architect (GCP and BigQuery)
Location: Halifax or Leeds (Hybrid)
Job Type: Contract
Role Summary
* We are seeking an experienced Cloud Data Loading Architect to design, build, and optimise automated pipelines that ingest structured, semi‑structured, and unstructured datasets into Google Cloud Platform (GCP), and specifically into BigQuery.
* This role will lead end‑to‑end data ingestion design—from source discovery and schema mapping, through transformation and data quality, to scalable, secure loads into cloud‑native analytical warehouses.
* The ideal candidate combines strong cloud engineering skills with hands‑on data integration experience and a deep understanding of BigQuery performance optimisation.
Key Responsibilities
* Design and implement high‑throughput, fault‑tolerant ingestion pipelines for batch and streaming data landing in BigQuery, using Dataflow, Dataproc, Composer (Airflow), Pub/Sub, the BigQuery Storage Write API, and related services (see the sketch after this list).
* Define data loading frameworks, mapping rules, schema evolution strategy, and metadata management.
* Create reusable ingestion blueprints that ensure governance, lineage, and auditability.
* Establish data quality checks, validation rules, reconciliation logic, and SLAs.
* Optimise BigQuery cost, storage, partitioning, clustering, and access patterns.
* Collaborate with security & platform teams to ensure IAM, service accounts, VPC Service Controls, and encryption policies are fully applied.
* Use GitHub, GitLab, or Jenkins for CI/CD.
* Produce detailed technical documentation and coach engineering squads.
* Troubleshoot ingestion failures, performance bottlenecks, and cross‑platform data integration issues.
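By way of illustration only, a minimal sketch of the kind of streaming ingestion pipeline the first responsibility above describes, using the Apache Beam Python SDK; the project, topic, table, and schema names are placeholders, not details from this role:

```python
# Minimal sketch: stream JSON events from Pub/Sub into BigQuery via the
# Storage Write API. All resource names below are illustrative placeholders.
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions


def parse_event(message: bytes) -> dict:
    """Decode a Pub/Sub message payload into a BigQuery-ready row."""
    return json.loads(message.decode("utf-8"))


def run():
    options = PipelineOptions(streaming=True)
    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            | "ReadFromPubSub" >> beam.io.ReadFromPubSub(
                topic="projects/example-project/topics/events")  # placeholder
            | "ParseJson" >> beam.Map(parse_event)
            | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
                table="example-project:analytics.events",  # placeholder
                schema="event_id:STRING,event_ts:TIMESTAMP,payload:STRING",
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
                create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
                method=beam.io.WriteToBigQuery.Method.STORAGE_WRITE_API,
            )
        )


if __name__ == "__main__":
    run()
```

The Storage Write API method selected here is generally recommended over legacy streaming inserts for high‑throughput loads into BigQuery.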
Qualifications
1. Deep Expertise in Google Cloud Data Services: BigQuery, GCS, Dataflow (Apache Beam), Pub/Sub, Dataproc, Cloud Composer, Storage Write API.
2. Data Ingestion Engineering Mastery: Hands‑on experience designing frameworks to load data from APIs, files, databases, event streams, and mainframe/legacy systems into cloud stores.
3. Strong SQL & BigQuery Optimisation skills: partitioning, clustering, materialised views, cost‑efficient query design, and columnar processing (see the first sketch after this list). Experience building transformation pipelines with Airflow, Dataflow, dbt, or equivalent orchestration tools, and the ability to work with Parquet, Avro, ORC, JSON, CSV, nested/repeated structures, and schema evolution.
4. Strong Python and/or Java skills for building Dataflow pipelines, ingestion utilities, and automation scripts.
5. Cloud Security & Governance awareness: IAM roles, least‑privilege models, VPC Service Controls, service accounts, artifact signing, and auditing. Familiarity with CI/CD and infrastructure‑as‑code tooling such as Cloud Build, GitHub Actions, Terraform, Cloud Deployment Manager, or Pulumi.
6. Data Quality & Observability mindset: experience implementing validation frameworks, anomaly detection, reconciliation rules (see the second sketch after this list), and logging/monitoring (e.g., Cloud Logging, Cloud Monitoring).
7. Excellent Architectural Communication Skills: Ability to document, diagram, and communicate ingestion patterns to stakeholders at technical and non‑technical levels.
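By way of illustration for qualification 3, a minimal sketch of creating a date‑partitioned, clustered table with the google-cloud-bigquery Python client; the project, dataset, table, and field names are hypothetical:

```python
# Minimal sketch: define a partitioned, clustered BigQuery table.
# All resource and field names are hypothetical.
from google.cloud import bigquery

client = bigquery.Client()

table = bigquery.Table(
    "example-project.analytics.events",  # placeholder table reference
    schema=[
        bigquery.SchemaField("event_id", "STRING"),
        bigquery.SchemaField("customer_id", "STRING"),
        bigquery.SchemaField("event_ts", "TIMESTAMP"),
        bigquery.SchemaField("payload", "STRING"),
    ],
)
# Daily time partitioning keeps scan cost proportional to the date range queried...
table.time_partitioning = bigquery.TimePartitioning(
    type_=bigquery.TimePartitioningType.DAY,
    field="event_ts",
)
# ...and clustering orders data within each partition so selective filters prune further.
table.clustering_fields = ["customer_id", "event_id"]

client.create_table(table)
```

Together, partitioning and clustering bound the bytes a query scans, which is what makes them the first levers for BigQuery cost optimisation.

And for qualification 6, a minimal sketch of a reconciliation rule comparing a source‑reported row count against what actually landed in BigQuery; the table layout, column names, and alerting behaviour are assumptions:

```python
# Minimal sketch: reconcile a source-reported row count with BigQuery.
# Table and column names are assumptions carried over from the sketch above.
from google.cloud import bigquery


def reconcile_row_count(client: bigquery.Client,
                        table: str,
                        load_date: str,
                        expected_rows: int) -> bool:
    """Return True when the loaded row count matches the source count."""
    query = f"""
        SELECT COUNT(*) AS loaded_rows
        FROM `{table}`
        WHERE DATE(event_ts) = @load_date
    """
    job = client.query(
        query,
        job_config=bigquery.QueryJobConfig(
            query_parameters=[
                bigquery.ScalarQueryParameter("load_date", "DATE", load_date)
            ]
        ),
    )
    loaded_rows = next(iter(job.result())).loaded_rows
    if loaded_rows != expected_rows:
        # A production framework would raise an alert via Cloud Monitoring
        # rather than print.
        print(f"Reconciliation gap: expected {expected_rows}, "
              f"loaded {loaded_rows}")
        return False
    return True
```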
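A check like this would typically run as a post‑load task in Composer (Airflow), failing the DAG run when the counts diverge so that SLA breaches surface immediately.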