Job Description
Role Overview
We are developing a next-generation data platform and are looking for an experienced Senior Data Engineer to help shape its architecture, reliability, and scalability. The ideal candidate will have more than 10 years of hands-on engineering experience and a strong background in building modern data pipelines, working with cloud-native technologies, and applying robust software engineering practices.
Key Responsibilities
Design & Build Data Pipelines
1. Design, develop, and optimise scalable, testable data pipelines using Python and Apache Spark.
2. Implement batch workflows and ETL processes adhering to modern engineering standards.
Develop Cloud-Based Workflows
3. Orchestrate data workflows using AWS services such as Glue, EMR Serverless, Lambda, and S3.
4. Contribute to the evolution of our lakehouse architecture, leveraging Apache Iceberg.
Apply Software Engineering Best Practices
5. Use version control, CI/CD pipelines, automated testing, and modular code principles.
6. Participate in pair programming, code reviews, and architectural design sessions.
Data Quality & Observability
7. Build monitoring and observability into data flows.
8. Implement basic data quality checks and contribute to their continuous improvement.
Stakeholder Collaboration
9. Work closely with business teams to translate requirements into data-driven solutions.
10. Develop an understanding of financial indices and share domain insights with the team.
What You’ll Bring
Technical Expertise
11. Strong experience writing clean, maintainable Python code, ideally using type hints, linters, and test frameworks such as pytest.
12. Solid understanding of data engineering fundamentals including batch processing, schema evolution, and ETL pipeline development.
13. Experience with Apache Spark for large-scale data processing, or a strong interest in learning it.
14. Familiarity with AWS data ecosystem tools such as S3, Glue, Lambda, and EMR.
Ways of Working
15. Comfortable working in Agile environments and contributing to collaborative team processes.
16. Ability to engage with business stakeholders and understand the broader context behind technical requirements.
Nice-to-Have Skills
17. Experience with Apache Iceberg or similar table formats (e.g., Delta Lake, Hudi).
18. Familiarity with CI/CD platforms such as GitLab CI, Jenkins, or GitHub Actions.
19. Exposure to data quality frameworks such as Great Expectations or Deequ.
20. Interest or background in financial markets, index data, or investment analytics.