Job Description
As a Principal / Senior Data Engineer, you will spearhead our data initiatives, transforming raw data into strategic assets that propel our business forward. This role is designed for those who excel in a dynamic environment, offering the freedom to innovate and the responsibility to deliver.
Core Responsibilities:
1. Identify and harness high-value data opportunities within our product suite, converting raw data into impactful features and reusable assets.
2. Act as the authoritative voice on data technology, steering our organization towards cutting-edge platforms and methodologies.
3. Architect and execute comprehensive data projects, ensuring robust scalability and performance from conception to deployment.
4. Collaborate with a team of ML Engineers, Data Scientists, and other key stakeholders to design, implement, and maintain data pipelines that fuel advanced analytics and AI-driven solutions.
5. Embrace the role of a data evangelist, presenting at industry conferences, leading webinars, and authoring thought-leadership content to share knowledge and influence the broader tech community.
Qualifications:
1. A minimum of 5 years in data engineering, with expertise in scalable architectures such as data lakes, graph databases, and vector databases (e.g., ADLS, Neo4j, Elasticsearch).
2. Proficiency in developing data pipelines across diverse environments, leveraging Azure and other modern technologies.
3. Proven ability to orchestrate complex data workflows and manage Kubernetes clusters on AKS, using tools such as Airflow, Kubeflow, Argo, and Dagster.
4. Familiarity with data ingestion tools such as Airbyte and Fivetran, accommodating a wide array of data sources.
5. Mastery of large-scale data processing techniques using Spark or Dask.
6. Strong programming skills in Python, Scala, C#, or Java, and adeptness with cloud SDKs and APIs.
7. Deep understanding of AI/ML to enhance pipeline efficiency, with experience in TensorFlow, PyTorch, AutoML, Python/R, and MLOps platforms such as MLflow and Kubeflow.
8. Solid background in DevOps, including CI/CD automation with Bitbucket Pipelines, Azure DevOps, and GitHub.
9. Ability to automate the deployment of data pipelines and applications using scripting languages and infrastructure-as-code tools such as Bash, PowerShell, Azure CLI, Terraform, and Helm charts.
10. Experience using Azure AI Search or Elasticsearch for sophisticated content analysis and indexing, particularly in developing retrieval-augmented generation (RAG) services with LangChain.
11. Proficiency in building IoT data pipelines, ensuring real-time processing, security, and integration with IoT ecosystems.
12. Skill in designing, developing, and monitoring streaming data applications with Kafka and related technologies.
13. Commitment to implementing and upholding data governance policies and standards across the data platform.
Personal Competencies:
1. A results-oriented professional with a passion for innovation and a self-motivated, proactive approach.
2. Exceptional organizational skills, fluency in spoken and written English, and a drive for creative solutions.
3. A strategic thinker, constantly exploring the next significant advancement in data technology.
We invite you to join our team and lead the way in data engineering, where your expertise will be pivotal in shaping the future of our data-driven endeavors.