Job Description
Role: Cloud Data Loading Architect (GCP and BigQuery)
Location: Halifax or Leeds (Hybrid)
Job Type: Contract
Role Summary
* We are seeking an experienced Cloud Data Loading Architect to design, build, and optimise automated pipelines that ingest structured, semi-structured, and unstructured datasets into Google Cloud Platform (GCP), specifically BigQuery.
* This role will lead end-to-end data ingestion design, from source discovery and schema mapping, through transformation and data quality, to scalable, secure loads into cloud-native analytical warehouses.
* The ideal candidate combines strong cloud engineering skills with hands-on data integration experience and a deep understanding of BigQuery performance optimisation.
Key Responsibilities
· Design and implement high-throughput, fault-tolerant ingestion pipelines for batch and streaming data landing in BigQuery (a minimal streaming sketch follows this list).
· Lead data ingestion architecture patterns using Cloud Storage (GCS), Dataflow, Dataproc, Composer (Airflow), Pub/Sub, the BigQuery Storage Write API, and related services.
· Define data loading frameworks, mapping rules, schema evolution strategy, and metadata management.
· Create reusable ingestion blueprints that ensure governance, lineage, and auditability.
· Establish data quality checks, validation rules, reconciliation logic, and SLAs.
· Optimise BigQuery cost, storage, partitioning, clustering, and access patterns.
· Collaborate with security & platform teams to ensure IAM, service accounts, VPC-SC, and encryption policies are fully applied.
· Automate CI/CD deployments for ingestion pipelines using Cloud Build, GitHub, GitLab, or Jenkins.
· Produce detailed technical documentation and coach engineering squads in cloud ingestion standards.
· Troubleshoot ingestion failures, performance bottlenecks, and cross-platform data integration issues.
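For illustration only, a minimal sketch of the kind of streaming ingestion pipeline described above: reading events from Pub/Sub and appending them to BigQuery via the Storage Write API using Apache Beam's Python SDK. The project, subscription, table, and field names are hypothetical placeholders, and Dataflow runner options are omitted for brevity.

```python
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions


def parse_event(message: bytes) -> dict:
    """Decode a Pub/Sub message into a row matching a hypothetical target schema."""
    event = json.loads(message.decode("utf-8"))
    return {"event_id": event["id"], "event_ts": event["ts"], "payload": json.dumps(event)}


options = PipelineOptions(streaming=True)  # runner/project flags omitted

with beam.Pipeline(options=options) as p:
    (
        p
        | "ReadPubSub" >> beam.io.ReadFromPubSub(
            subscription="projects/my-project/subscriptions/events-sub")  # hypothetical
        | "ParseJSON" >> beam.Map(parse_event)
        | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
            table="my-project:analytics.events",  # hypothetical
            method=beam.io.WriteToBigQuery.Method.STORAGE_WRITE_API,
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
            create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
        )
    )
```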
Top 10 Skillset & Qualities (Ideal Candidate)
1. Deep Expertise in Google Cloud Data Services
BigQuery, GCS, Dataflow (Apache Beam), Pub/Sub, Dataproc, Cloud Composer, and the Storage Write API.
2. Data Ingestion Engineering Mastery
Hands-on experience designing frameworks to load data from APIs, files, databases, event streams, and mainframe/legacy systems into cloud stores.
3. Strong SQL & BigQuery Optimisation Skills
Partitioning, clustering, materialised views, cost-efficient query design, and an understanding of columnar storage.
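As a small illustration of the partitioning and clustering concepts above, a minimal sketch using the google-cloud-bigquery Python client; the project, dataset, and field names are hypothetical.

```python
from google.cloud import bigquery

client = bigquery.Client()

# Hypothetical events table: partition by event day, cluster by customer.
table = bigquery.Table(
    "my-project.analytics.events",
    schema=[
        bigquery.SchemaField("event_id", "STRING"),
        bigquery.SchemaField("event_ts", "TIMESTAMP"),
        bigquery.SchemaField("customer_id", "STRING"),
    ],
)
# Daily time partitioning prunes scanned bytes for date-bounded queries.
table.time_partitioning = bigquery.TimePartitioning(
    type_=bigquery.TimePartitioningType.DAY, field="event_ts"
)
# Clustering co-locates rows by customer_id, cutting cost for selective filters.
table.clustering_fields = ["customer_id"]

client.create_table(table)
```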
4. ETL/ELT Architecture Knowledge
Experience building transformation pipelines using Airflow, Dataflow, dbt, or equivalent orchestration tools.
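A minimal Cloud Composer (Airflow) sketch of such an orchestrated load, assuming Airflow 2.4+ with the Google provider package installed; the DAG, bucket, dataset, and table names are hypothetical.

```python
import pendulum
from airflow import DAG
from airflow.providers.google.cloud.transfers.gcs_to_bigquery import GCSToBigQueryOperator

with DAG(
    dag_id="daily_orders_ingest",  # hypothetical
    schedule="@daily",
    start_date=pendulum.datetime(2024, 1, 1, tz="UTC"),
    catchup=False,
) as dag:
    # Load the day's Parquet drop from GCS into a staging table.
    load_orders = GCSToBigQueryOperator(
        task_id="load_orders",
        bucket="my-landing-bucket",  # hypothetical
        source_objects=["orders/{{ ds }}/*.parquet"],
        source_format="PARQUET",
        destination_project_dataset_table="my-project.staging.orders",  # hypothetical
        write_disposition="WRITE_TRUNCATE",
    )
```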
5. File & Format Proficiency
Ability to work with Parquet, Avro, ORC, JSON, CSV, nested/repeated structures, and schema evolution.
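As one way of handling additive schema evolution during loads, a sketch of a BigQuery load job that appends Parquet files and permits new columns to extend the table; the bucket and table names are hypothetical.

```python
from google.cloud import bigquery

client = bigquery.Client()

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.PARQUET,
    write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
    # Tolerate additive schema drift: new fields in the files extend the table.
    schema_update_options=[bigquery.SchemaUpdateOption.ALLOW_FIELD_ADDITION],
)

load_job = client.load_table_from_uri(
    "gs://my-landing-bucket/orders/*.parquet",  # hypothetical
    "my-project.staging.orders",  # hypothetical
    job_config=job_config,
)
load_job.result()  # wait for completion; raises on load errors
```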
6. Strong Python and/or Java Skills
Used to build Dataflow pipelines, ingestion utilities, and automation scripts.
7. Cloud Security & Governance Awareness
IAM roles, least-privilege models, VPC-SC, service accounts, artifact signing, and audit logging.
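As a small illustration of a least-privilege grant, a sketch using the google-cloud-bigquery client to give a loader service account read access to a single dataset rather than a project-wide role; the service account and dataset names are hypothetical.

```python
from google.cloud import bigquery

client = bigquery.Client()

dataset = client.get_dataset("my-project.staging")  # hypothetical dataset

# Grant dataset-scoped READER access to the loader service account
# instead of a broad project-level role (least privilege).
entries = list(dataset.access_entries)
entries.append(
    bigquery.AccessEntry(
        role="READER",
        entity_type="userByEmail",
        entity_id="loader-sa@my-project.iam.gserviceaccount.com",  # hypothetical
    )
)
dataset.access_entries = entries
client.update_dataset(dataset, ["access_entries"])
```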
8. DevOps & CI/CD Familiarity
Cloud Build, GitHub Actions, Terraform, Cloud Deployment Manager or Pulumi.
9. Data Quality & Observability Mindset
Experience implementing validation frameworks, anomaly detection, reconciliation rules, and logging/monitoring (e.g., Cloud Logging, Cloud Monitoring).
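For illustration, a minimal reconciliation check of the kind mentioned above, comparing per-day row counts between a hypothetical staging table and its target and failing loudly on divergence.

```python
import datetime

from google.cloud import bigquery

client = bigquery.Client()


def reconcile_daily_counts(staging: str, target: str, day: datetime.date) -> None:
    """Raise if staged and loaded row counts diverge (hypothetical tables/fields)."""
    sql = f"""
        SELECT
          (SELECT COUNT(*) FROM `{staging}` WHERE DATE(event_ts) = @day) AS staged,
          (SELECT COUNT(*) FROM `{target}` WHERE DATE(event_ts) = @day) AS loaded
    """
    job_config = bigquery.QueryJobConfig(
        query_parameters=[bigquery.ScalarQueryParameter("day", "DATE", day)]
    )
    row = next(iter(client.query(sql, job_config=job_config).result()))
    if row.staged != row.loaded:
        raise ValueError(f"Reconciliation failed: staged={row.staged}, loaded={row.loaded}")


reconcile_daily_counts(
    "my-project.staging.orders",    # hypothetical
    "my-project.analytics.orders",  # hypothetical
    datetime.date(2024, 1, 1),
)
```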
10. Excellent Architectural Communication Skills
Ability to document, diagram, and communicate ingestion patterns to stakeholders at technical and non-technical levels.