Role Responsibilities
You will be responsible for:
* Collaborating with cross-functional teams to understand data requirements, and designing efficient, scalable, and reliable ETL processes using Python and Databricks.
* Developing and deploying ETL jobs that extract data from various sources and transform it to meet business needs (a rough sketch follows this list).
* Taking ownership of the end-to-end engineering lifecycle, including data extraction, cleansing, transformation, and loading, ensuring accuracy and consistency.
* Creating and managing data pipelines, ensuring proper error handling, monitoring, and performance optimization.
* Working in an agile environment, participating in sprint planning, daily stand-ups, and retrospectives.
* Conducting code reviews, providing constructive feedback, and enforcing coding standards to maintain high quality.
* Developing and maintaining tooling and automation scripts to streamline repetitive tasks.
* Implementing unit, integration, and other testing methodologies to ensure the reliability of ETL processes.
* Utilizing REST APIs and other integration techniques to connect various data sources.
* Maintaining documentation, including data flow diagrams, technical specifications, and operational processes.
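To give a rough sense of the kind of ETL job this involves, here is a minimal sketch assuming a Databricks/PySpark environment; the paths and column names are hypothetical, not a specific pipeline used by the team:

```python
# Minimal ETL sketch (hypothetical paths and column names) for a Databricks/PySpark job.
from pyspark.sql import SparkSession, functions as F

# In a Databricks notebook a SparkSession named `spark` already exists;
# getOrCreate() simply reuses it when run there.
spark = SparkSession.builder.appName("orders_etl").getOrCreate()

# Extract: read raw CSV data from a (hypothetical) landing path.
raw = (
    spark.read
    .option("header", True)
    .option("inferSchema", True)
    .csv("/mnt/landing/orders/")
)

# Transform: cleanse and standardize the data to meet business needs.
clean = (
    raw
    .dropDuplicates(["order_id"])                       # remove duplicate records
    .filter(F.col("order_total") >= 0)                  # drop obviously invalid rows
    .withColumn("order_date", F.to_date("order_date"))  # normalize the date column
)

# Load: write the curated result as a Delta table for downstream consumers.
clean.write.format("delta").mode("overwrite").save("/mnt/curated/orders/")
```

In practice such a job would also carry the error handling, monitoring, and testing called out above.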
You Have:
* Proficiency in Python programming, including experience in writing efficient and maintainable code.
* Hands-on experience with cloud services, especially Databricks, for building and managing scalable data pipelines.
* Proficiency in working with Snowflake or similar cloud-based data warehousing solutions.
* Solid understanding of ETL principles, data modelling, data warehousing concepts, and data integration best practices.
* Familiarity with agile methodologies and the ability to work collaboratively in a fast-paced, dynamic environment.
* Experience with code versioning tools (e.g., Git).
* Meticulous attention to detail and a passion for problem-solving.
* Knowledge of Linux operating systems.
* Familiarity with REST APIs and integration techniques (see the sketch after this list).
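As an illustration of the integration work referenced above, here is a minimal sketch of pulling records from a paginated REST API; the endpoint, token handling, and pagination scheme are assumptions for illustration only:

```python
# Sketch of extracting records from a paginated REST API (hypothetical endpoint
# and pagination scheme) before handing them to an ETL job.
import requests

BASE_URL = "https://api.example.com/v1/orders"   # hypothetical endpoint
API_TOKEN = "..."                                # supplied via a secret store in practice

def fetch_all(page_size: int = 100) -> list[dict]:
    """Collect every record by following simple page-number pagination."""
    session = requests.Session()
    session.headers.update({"Authorization": f"Bearer {API_TOKEN}"})

    records, page = [], 1
    while True:
        resp = session.get(BASE_URL, params={"page": page, "per_page": page_size}, timeout=30)
        resp.raise_for_status()      # surface HTTP errors instead of silently continuing
        batch = resp.json()
        if not batch:                # an empty page signals the end of the data
            break
        records.extend(batch)
        page += 1
    return records
```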
You might also have:
* Familiarity with data visualization tools and libraries (e.g., Power BI).
* Background in database administration or performance tuning.
* Familiarity with data orchestration tools, such as Apache Airflow (a minimal sketch follows this list).
* Previous exposure to big data technologies (e.g., Hadoop, Spark) for large-scale data processing.
* Experience with ServiceNow integration.
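For context on the orchestration point above, a minimal sketch of how extract, transform, and load steps might be wired together as an Apache Airflow DAG (Airflow 2.x style; the DAG id and task bodies are hypothetical placeholders):

```python
# Rough sketch of orchestrating ETL steps as an Apache Airflow DAG (Airflow 2.x).
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    """Pull raw data from the source systems (placeholder)."""


def transform():
    """Cleanse and reshape the extracted data (placeholder)."""


def load():
    """Write the curated data to the warehouse (placeholder)."""


with DAG(
    dag_id="orders_etl",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",   # run once per day
    catchup=False,                # do not backfill historical runs
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    load_task = PythonOperator(task_id="load", python_callable=load)

    extract_task >> transform_task >> load_task   # linear dependency chain
```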