Job Description
Role Summary
We are looking for an experienced Cloud Data Loading Architect to design, build, and optimize scalable data ingestion pipelines on Google Cloud Platform (GCP), with a strong focus on BigQuery.
The ideal candidate will lead end-to-end data ingestion architecture—from source discovery and schema mapping to transformation, validation, and high-performance loading into cloud-native data warehouses. This role requires a combination of strong cloud engineering expertise, hands-on data integration experience, and deep knowledge of BigQuery performance optimization.
Key Responsibilities
* Design and implement high-throughput, fault-tolerant ingestion pipelines for both batch and streaming data into BigQuery.
* Architect data ingestion solutions using GCS, Dataflow (Apache Beam), Pub/Sub, Dataproc, Cloud Composer (Airflow), and BigQuery Storage Write API.
* Define data loading frameworks, schema evolution strategies, mapping rules, and metadata management processes.
* Develop reusable ingestion patterns ensuring data governance, lineage, and auditability.
* Implement data quality checks, validation rules, reconciliation logic, and SLA monitoring.
* Optimize BigQuery performance and cost through partitioning, clustering, and efficient query design.
* Collaborate with security teams to enforce IAM, VPC Service Controls (VPC-SC), encryption, and access policies.
* Automate deployment pipelines using CI/CD tools such as Cloud Build, GitHub Actions, GitLab, or Jenkins.
* Provide technical documentation and mentor engineering teams on best practices.
* Troubleshoot ingestion issues, performance bottlenecks, and cross-platform integration challenges.
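The schema evolution strategies mentioned above can be sketched as an additive-only merge policy: new source columns are admitted as NULLABLE, while in-place type changes are rejected for manual review. This is a minimal pure-Python sketch; `evolve_schema` and its dict-based schema shape are illustrative assumptions, not a BigQuery API.

```python
def evolve_schema(current, incoming):
    """Additive-only schema evolution (illustrative policy).

    current:  {column_name: {"type": ..., "mode": ...}}
    incoming: {column_name: type_string} observed in the new source batch.

    New columns are added as NULLABLE; a type change on an existing
    column raises so the batch can be quarantined for review.
    """
    evolved = dict(current)
    for name, ftype in incoming.items():
        if name not in evolved:
            # Unseen column: admit it, but never as REQUIRED.
            evolved[name] = {"type": ftype, "mode": "NULLABLE"}
        elif evolved[name]["type"] != ftype:
            raise ValueError(
                f"type change for {name!r}: "
                f"{evolved[name]['type']} -> {ftype}"
            )
    return evolved
```

In practice the evolved schema would be written back to a metadata store and applied to the target table before loading.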
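The data quality and reconciliation responsibilities can be illustrated with a minimal sketch: row-level checks route records to an accepted or rejected set, and a reconciliation step asserts that every source row is accounted for. `validate_rows` and `reconcile` are hypothetical helpers for illustration, not part of any GCP library.

```python
def validate_rows(rows, rules):
    """Split rows into (accepted, rejected) using per-column checks.

    `rules` maps column name -> predicate over that column's value.
    Rejected rows are paired with the list of failed column names,
    which supports downstream dead-letter handling and auditing.
    """
    accepted, rejected = [], []
    for row in rows:
        failed = [col for col, check in rules.items()
                  if not check(row.get(col))]
        if failed:
            rejected.append((row, failed))
        else:
            accepted.append(row)
    return accepted, rejected


def reconcile(source_count, loaded_count, rejected_count):
    """Reconciliation invariant: every source row was loaded or rejected."""
    return source_count == loaded_count + rejected_count
```

A production version would typically persist the rejected set to a dead-letter table and alert when the reconciliation invariant fails.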
Required Skills & Qualifications
Core Technical Skills
* Strong expertise in Google Cloud data services:
  * BigQuery, GCS, Dataflow, Pub/Sub, Dataproc, Cloud Composer
* Proven experience in data ingestion and integration frameworks
* Advanced SQL and BigQuery optimization skills (partitioning, clustering, cost optimization)
* Hands-on experience with ETL/ELT tools (Airflow, Dataflow, dbt, etc.)
* Proficiency in Python and/or Java for pipeline development
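As one concrete instance of the partitioning and clustering skills listed above, the following sketch builds BigQuery DDL for a date-partitioned, clustered table. The helper function, table, and column names are hypothetical; a real pipeline would typically issue such DDL through the BigQuery client library or infrastructure-as-code tooling.

```python
def partitioned_table_ddl(table, columns, partition_col, cluster_cols):
    """Render CREATE TABLE DDL with date partitioning and clustering.

    Partitioning on a timestamp column lets queries prune whole
    partitions; clustering co-locates rows by the given columns,
    which together reduce bytes scanned and therefore cost.
    """
    col_defs = ",\n  ".join(f"{name} {ftype}" for name, ftype in columns)
    return (
        f"CREATE TABLE {table} (\n  {col_defs}\n)\n"
        f"PARTITION BY DATE({partition_col})\n"
        f"CLUSTER BY {', '.join(cluster_cols)}"
    )
```

For example, partitioning an events table by `DATE(event_ts)` and clustering by `user_id` means a query filtered on a date range and a user scans only the matching partitions.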