Job Description
CACI is looking for a Data Product Engineer within the Product Strategy & Implementation team who will help improve the existing data product refresh processes and migrate data products.
Who we are:
At CACI we do amazing things with data. We are experts in all things consumer and location, bringing together cutting-edge analytical techniques, creative thinking and diverse perspectives to drive growth for our clients. We build some of the most highly regarded, innovative datasets in the market and our people are the best at manipulating that data to provide insight to our clients. As part of the wider Marketing Solutions Division, you will be joining a 250-strong team working in over 50 markets globally.
Our people are what really make us different. We are a growing and dynamic group of analysts, data scientists and commercially savvy business development consultants who also provide thought leadership and creative thinking. We are passionate, progressive and unafraid of challenge; our mission is to use data-driven insight to make a commercial difference.
The role
The purpose of this role is to:
* Perform regular data product refreshes
* Help improve the existing data product refresh processes
* Help with the migration of data products:
  * From SAS to Python
  * From on-premises to AWS
Key Responsibilities
* Follow existing workflows to refresh a subset of our existing data products
* Refresh existing internal and external input data sets, often sourced from surveys
* Source control figures from both industry and government sources
* Re-fit ML models on the latest data
* Follow QA processes throughout
* Produce updated documentation such as Process Documentation, Release Notes and Technical Guides
* Upload deliverables (outputs and documentation) to both internal and AWS resources
* Resolve any identified issues with the existing products and/or refresh processes
* Look for opportunities to improve the product refresh process, particularly greater use of automation
  * The existing data products make extensive use of SAS and CSV files, some use of SQL and Excel, and increasing use of Python
  * There are also extensive dependencies between the products, which means that any evolution of the products must be carefully planned
  * There is currently significant manual involvement in the processes, which we are seeking to reduce where possible
* Assist in the migration of the data products from SAS to Python
  * Most of our code base is in SAS
  * Going forward, major redevelopments of our products will take place in Python
* Contribute ideas to assist the migration of on-premises product processes to AWS
  * We are in the early stages of migrating our products and update processes to AWS, and any experience that could inform the migration is welcome
Skills and Experience
* Experience with SAS, Python, SQL, version control (ideally Git/GitLab) and running in-production data product updates
* Good analytical and problem-solving skills, attention to detail and a focus on standardisation and delivery
* It would be advantageous, but not essential, to have experience with any of the following tools: Visual Studio Code, Power BI, Linux, Airflow.
* An appreciation of the main concepts of Data Science/Machine Learning would be useful. We use a range of predictive analytics and machine learning methodologies, including logistic regression and cluster analysis, plus some predictive time series analysis.