Job Description
The Data Engineer will implement data ingestion and transformation pipelines for large-scale organizations. We are seeking someone with deep technical skills in a variety of technologies to play an important role in developing and delivering early proofs of concept and production implementations.
You will build solutions using a variety of open-source tools and Microsoft Azure services, and you will need a proven track record of delivering high-quality work to tight deadlines.
Your main responsibilities will be:
* Designing and implementing highly performant data ingestion & transformation pipelines from multiple sources using a variety of technologies
* Delivering and presenting proofs of concept of key technology components to prospective customers and project stakeholders
* Developing scalable and re-usable frameworks for ingestion and transformation of large data sets
* Designing and implementing master data management systems and processes
* Designing and implementing data quality systems and processes
* Integrating the end-to-end data pipeline to take data from source systems to target data repositories, ensuring the quality and consistency of data is maintained at all times
* Working with event-based / streaming technologies to ingest and process data
* Working with other members of the project team to support the delivery of additional project components (Reporting tools, API interfaces, Search)
* Evaluating the performance and applicability of multiple tools against customer requirements
* Working within an Agile delivery / DevOps methodology to deliver proofs of concept and production implementations in iterative sprints
Qualifications
* Hands-on experience designing and delivering solutions using the Azure Data Analytics platform (Cortana Intelligence Platform), including Azure Storage, Azure SQL Database, Azure SQL Data Warehouse, Azure Data Lake, Azure Cosmos DB, and Azure Stream Analytics
* Direct experience in building data pipelines using Azure Data Factory and Apache Spark (preferably Databricks)
* Experience building data warehouse solutions using ETL / ELT tools such as SQL Server Integration Services (SSIS), Oracle Data Integrator (ODI), Talend, and WhereScape RED
* Experience with Azure Event Hub, IoT Hub, Apache Kafka, and NiFi for streaming / event-based data
* Experience with other open-source big data products, e.g., Hadoop (including Hive, Pig, Impala)
* Experience with open-source non-relational / NoSQL data repositories (including MongoDB, Cassandra, Neo4j)
* Experience working with structured and unstructured data, including imaging & geospatial data
* Comprehensive understanding of data management best practices, including data profiling, sourcing, and cleansing routines involving standardization, transformation, rationalization, linking, and matching
* Experience working in a DevOps environment with tools such as Microsoft Visual Studio Team Services, Chef, Puppet, or Terraform