Key Responsibilities
Design, develop, and maintain ETL pipelines for structured and semi-structured data
Build scalable data transformations using Python and PySpark on Databricks (see the illustrative sketch after this list)
Optimize Spark jobs for performance, reliability, and cost efficiency
Work with data stored in cloud-based data lakes and warehouses
Implement data quality checks, monitoring, and error handling
Collaborate with analytics, data science, and product teams to deliver clean datasets
Build or maintain basic React UI components for dashboards and internal tools
Participate in code reviews, Agile ceremonies, and continuous improvement initiatives
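By way of illustration, day-to-day pipeline work in this role might look like the following minimal sketch. It is hypothetical rather than taken from an actual codebase: the bucket path, table name, and column names are placeholders, and it assumes a Databricks-style environment where a SparkSession and Delta Lake are available. It also shows the kind of lightweight data quality check mentioned above.

```python
# Minimal ETL sketch: extract semi-structured JSON from a data lake,
# transform it, run a basic quality check, and load it into a Delta table.
# All names and paths below are illustrative placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders_etl").getOrCreate()

# Extract: read semi-structured JSON from cloud storage (path is hypothetical).
raw = spark.read.json("s3://example-bucket/raw/orders/")

# Transform: normalize types, derive columns, and drop malformed rows.
orders = (
    raw
    .withColumn("order_ts", F.to_timestamp("order_ts"))
    .withColumn("amount", F.col("amount").cast("double"))
    .filter(F.col("order_id").isNotNull())
)

# Data quality check: fail fast if a key invariant is violated.
bad_rows = orders.filter(F.col("amount") < 0).count()
if bad_rows > 0:
    raise ValueError(f"{bad_rows} rows have negative amounts")

# Load: write to a Delta table, partitioned by date for downstream pruning.
(
    orders
    .withColumn("order_date", F.to_date("order_ts"))
    .write.format("delta")
    .mode("overwrite")
    .partitionBy("order_date")
    .saveAsTable("analytics.orders_clean")
)
```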
Required Skills & Qualifications
Strong experience designing and building ETL data pipelines
Proficiency in Python for data engineering
Hands-on experience with PySpark and Apache Spark concepts
Practical experience working with Databricks (jobs, notebooks, workflows)
Good understanding of data modeling, transformations, and performance tuning (a tuning sketch follows this list)
Experience working in Agile/Scrum environments
Basic knowledge of React (components, props, state, REST API integration)
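As a hypothetical illustration of the performance-tuning skill listed above, one common Spark optimization is replacing a shuffle join with a broadcast join when one side is small. The table names here are placeholders, and an active SparkSession is assumed.

```python
# Tuning sketch: broadcast a small dimension table so the join happens
# locally on each executor instead of shuffling the large fact table.
# Table names are hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("tuning_example").getOrCreate()

facts = spark.table("analytics.orders_clean")    # large fact table
dims = spark.table("analytics.dim_customers")    # small lookup table

# Broadcast hint: ship the small table to every executor, avoiding a shuffle.
joined = facts.join(F.broadcast(dims), on="customer_id", how="left")

# Inspect the physical plan to verify a BroadcastHashJoin was chosen.
joined.explain()
```

Broadcast joins trade executor memory for network shuffle, so they only pay off when the broadcast side comfortably fits in memory on each executor.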