Context
Collective.work is building the next-generation AI-powered sourcing platform for recruiters. Our mission is to help talent teams identify, engage, and hire the best candidates faster through intelligent automation and data-driven insights. We operate at the intersection of data, AI, and recruiting workflows—where high-quality data infrastructure is critical to our success.
Responsibilities
* Design and maintain scalable data pipelines (batch and real-time)
* Build and optimize ETL/ELT workflows across Azure and/or GCP
* Develop data models and architectures to support analytics and ML use cases
* Ensure data quality, integrity, and reliability across systems
* Collaborate with ML engineers to prepare and serve training datasets
* Monitor and improve pipeline performance, cost efficiency, and scalability
* Implement best practices for data governance, security, and compliance
* Contribute to tooling and infrastructure decisions
Tools & Environment
* Cloud: Azure (Data Factory, Synapse) and/or GCP (BigQuery, Dataflow)
* Data Processing: Python, SQL, Spark
* Orchestration: Airflow / Prefect
* Storage: Data lakes and data warehouses
* Streaming: Kafka / PubSub (nice to have)
* DevOps: Docker, CI/CD
Working Conditions
* Flexible remote work environment
* Opportunity to work on a product at the cutting edge of AI and recruiting
* High ownership and impact from day one
* Collaborative, product-driven engineering culture
* Opportunity to shape the data foundation of a growing platform