Principal AI Engineer - ML Ops
About the Team
The AI Center of Excellence team includes Data Scientists and AI Engineers who work together to conduct research, build prototypes, design features, and build production AI components and systems. Our mission is to leverage the best available technology to protect our customers' attack surfaces. We partner closely with Detection and Response teams, including our MDR service, to apply AI/ML for enhanced customer security and threat detection. We operate with a creative, iterative approach, building on 20+ years of threat analysis and a growing patent portfolio. We foster a collaborative environment, sharing knowledge, developing internal learning, and encouraging research publication. If you’re passionate about AI and want to make a major impact in a fast-paced, innovative environment, this is your opportunity.
The technologies we use include:
1. AWS for hosting our research environments, data, and features
2. EKS to deploy applications
3. Terraform to manage infrastructure
4. Python for analysis and modeling, taking advantage of NumPy and pandas for data wrangling
5. Jupyter notebooks (locally and remotely hosted) as a computational environment
6. scikit-learn for building machine learning models
7. Anomaly detection methods to make sense of unlabeled data
About the Role
Rapid7 is seeking a Principal AI Engineer to join our team as we expand and evolve our growing AI and MLOps efforts. You should have a strong foundation in applied AI R&D, software engineering, and MLOps and DevOps systems and tools. Further, you’ll have a demonstrated track record of taking models created in the AI R&D process to production with repeatable deployment, monitoring and observability patterns. In this intersectional role, you will combine your expertise in AI/ML deployments, cloud systems and software engineering to enhance our product offerings and streamline our platform's functionalities.
In this role, you will:
1. Architect and manage the end-to-end design of ML production systems, including project scoping, data requirements, modeling strategies, and deployment
2. Develop and maintain data pipelines, manage the data lifecycle, and ensure data quality and consistency throughout
3. Assure robust implementation of ML guardrails and manage all aspects of service monitoring
4. Develop and deploy accessible endpoints, including web applications and REST APIs, while maintaining steadfast data privacy and adherence to security best practices and regulations
5. Share expertise and knowledge consistently with internal and external stakeholders, nurturing a collaborative environment and fostering the development of junior engineers
6. Embrace agile development practices, valuing constant iteration, improvement, and effective problem-solving in complex and ambiguous scenarios
The skills you’ll bring include:
1. 15 years of experience as a Software Engineer, with 3-5 years focused on ML deployment (especially in AWS)
2. Solid technical experience in the following is required:
Software engineering: developing APIs with Flask or FastAPI, paired with strong Python knowledge
DevOps and MLOps: designing and integrating scalable AI/ML systems into production environments, CI/CD tooling, Docker, Kubernetes, cloud AI resource utilization and management
Pipelines, monitoring, and observability: data pre-processing and feature engineering, model monitoring and evaluation
3. A growth mindset - welcoming the challenge of tackling complex problems with a bias for action
4. Strong written and verbal communication skills - able to communicate technical concepts effectively to diverse audiences and to create clear documentation of system architectures and implementation details
5. Proven ability to collaborate effectively across engineering, data science, product, and other teams to drive successful MLOps initiatives and ensure alignment on goals and deliverables
6. Established track record of mentoring and guiding junior engineers, fostering their technical growth and promoting engineering excellence within the organization
Experience with the following would be advantageous:
1. AI and ML models, understanding their operational frameworks and limitations
2. Deploying resources that enable data scientists to fine-tune and experiment with LLMs
3. Implementing model risk management strategies, including model registries, concept/covariate drift monitoring, and hyperparameter tuning
We know that the best ideas and solutions come from multi-dimensional teams. That’s because these teams reflect a variety of backgrounds and professional experiences. If you are excited about this role and feel your experience can make an impact, please don’t be shy - apply today.