Summary
This is a hands-on DataOps / Data Quality Engineer role focused on building data validation frameworks and automated testing for Azure-based data platforms. The role also covers DataOps responsibilities: ensuring reliable, observable, and well-governed pipeline operations across Fabric Data Factory, Azure Data Factory, and Synapse environments. In addition, the engineer will take on Data Reliability Engineering responsibilities aligned to Site Reliability Engineering (SRE) practices.
Key Responsibilities
* Build, maintain, or leverage open-source data validation frameworks to ensure data accuracy, schema integrity, and quality across ingestion and transformation pipelines
* Test and validate data pipelines and PySpark notebooks developed by Data Engineers, ensuring they meet quality, reliability, and validation standards
* Define and standardize monitoring, logging, alerting, and KPIs/SLAs across the data platform to enable consistent measurement of data reliability
* Identify and create Azure Monitor alert rules, and develop KQL queries to extract metrics and logs from Azure Monitor/Log Analytics for reliability tracking and alerting
* Write SQL queries and PowerShell scripts (or scripts in another language) to automate the execution of validation routines, verify pipeline outputs, and support end-to-end data quality workflows
* Collaborate with Data Engineering, Cloud, and Governance teams to embed standardized validation and reliability practices into their workflows
* Document validation rules, testing processes, operational guidelines, and data reliability best practices to ensure consistency across teams
What We’re Looking For
* Strong background in data validation frameworks, automated testing, data verification logic, and quality enforcement
* Experience automating data validations, reconciliations, and alert generation
* Experience with Azure Monitor, including setting up alert rules and building dashboards from Log Analytics data queried with KQL
* Experience with Fabric Data Factory, Azure Data Factory, Synapse pipelines, and PySpark notebooks
* Hands-on experience calling REST/OData APIs to validate data
* Experience writing SQL and scripts to programmatically validate and reconcile data across systems
* Strong understanding of the Azure ecosystem, including identity, network security, storage, and authentication models
* Working experience with Azure DevOps and CI/CD
* Strong debugging, incident resolution, and system reliability skills aligned to SRE
* Ability to work independently while collaborating effectively across Data Engineering, Cloud, Analytics, and Governance teams
Beneficial Experience
* Experience in the data space, with strong exposure to data testing, validation, and Data Reliability Engineering
* Experience defining and tracking data quality KPIs, operational KPIs, and SLAs to measure data reliability and performance
* Hands-on experience using Azure Monitor, Log Analytics, and writing KQL queries to collect monitoring data and define alert rules
* Experience writing SQL queries and PowerShell scripts (or scripts in another language) to automate data validation, reconciliation, and rule execution
* Exposure to data validation frameworks such as Great Expectations, Soda, or custom SQL/PySpark rule engines
* Experience validating pipelines and PySpark notebooks developed by data engineering teams across Fabric Data Factory, Azure Data Factory, and Synapse
* Experience defining and documenting validation rules, operational testing guidelines, and reliability processes for consistent team adoption