Data Scientist
We're looking for a Data Scientist to help establish the quantitative foundation of a cutting-edge trust and validation framework for autonomous systems. In this role, you'll design rigorous statistical methodologies to evaluate system performance, develop confidence and reliability metrics, and support large-scale deployment with robust measurement systems. Your work will be critical in validating performance in high-stakes domains and enabling data-driven decisions as the platform scales from early users to millions of interactions per month.
Responsibilities
1. Design statistical frameworks to validate autonomous system performance with academic rigor
2. Develop mathematical models to quantify trust, reliability, and performance in complex domains
3. Build autoscaling algorithms for compute resource optimization at scale
4. Create projection models for quota growth and capacity planning across multi-region deployments
5. Establish methodologies to measure system composition, including dynamic and contextual behavior
6. Design systems for context traceability and statistical validation of reasoning pathways
7. Develop confidence calculation methods across simulation runs and deployment conditions (see the sketch following this list)
8. Create judge coverage frameworks for comprehensive performance evaluation
9. Define metrics tied to interpretability, safety, and business outcomes
10. Design attribution systems that identify key components contributing to system performance
11. Model capability expansion to measure growth while maintaining reliability
12. Collaborate with verification and simulation teams to define evaluation standards
13. Contribute to academic publications and technical content showcasing scientific rigor
14. Work with engineering teams to implement statistical measurement systems in production
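To make the confidence-calculation responsibility above more concrete, here is a minimal sketch of one common approach: a percentile bootstrap over per-run success rates from a batch of simulation runs. The data, function name, and 95% level are illustrative assumptions for this posting, not a description of the team's actual pipeline.

```python
# Minimal sketch: percentile-bootstrap confidence interval for the mean
# success rate across simulation runs. All data and names are hypothetical;
# a production system would pull per-run metrics from an evaluation store
# and likely stratify by deployment condition.
import numpy as np

def bootstrap_ci(per_run_scores, n_boot=10_000, level=0.95, seed=0):
    """Percentile bootstrap CI for the mean of per-run scores."""
    rng = np.random.default_rng(seed)
    scores = np.asarray(per_run_scores, dtype=float)
    # Resample runs with replacement and record each resample's mean.
    means = np.array([
        rng.choice(scores, size=scores.size, replace=True).mean()
        for _ in range(n_boot)
    ])
    alpha = (1.0 - level) / 2.0
    lo, hi = np.quantile(means, [alpha, 1.0 - alpha])
    return scores.mean(), (lo, hi)

if __name__ == "__main__":
    # Hypothetical per-run success rates from 20 simulation runs.
    runs = [0.91, 0.88, 0.95, 0.90, 0.87, 0.93, 0.89, 0.92, 0.94, 0.86,
            0.90, 0.91, 0.88, 0.93, 0.92, 0.89, 0.95, 0.87, 0.90, 0.91]
    mean, (lo, hi) = bootstrap_ci(runs)
    print(f"mean success rate: {mean:.3f}, 95% CI: [{lo:.3f}, {hi:.3f}]")
```

Resampling at the level of whole runs, rather than individual interactions, keeps correlated events within a run from artificially narrowing the interval.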
Qualifications
1. Advanced degree in statistics, data science, applied mathematics, or a related field
2. Strong foundation in statistical methods, experimental design, and measurement frameworks
3. Experience applying quantitative approaches to complex system evaluation
4. Background in building performance metrics for AI or software systems
5. Proficiency with confidence intervals, variance analysis, and statistical validation
6. Experience designing experiments to quantify behavior across variable conditions (see the sketch following this list)
7. Fluency in Python, statistical tools, and data analysis libraries
8. Ability to connect metrics to business impact and technical performance
9. Experience with data visualization for communicating complex concepts
10. Academic or industry publication experience is a plus
11. Passion for scientific rigor and trustworthy evaluation in AI systems
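The experiment-design and variance-analysis qualifications above might be exercised on questions like "does performance differ across deployment conditions?" The sketch below is illustrative only: a one-way ANOVA over hypothetical per-run quality scores grouped by condition, using SciPy. Condition names and numbers are made up for this example.

```python
# Minimal sketch: one-way ANOVA testing whether mean performance differs
# across deployment conditions. All condition names and scores are
# hypothetical; a real analysis would also check assumptions (normality,
# comparable variances) or fall back to a non-parametric alternative.
from scipy import stats

# Hypothetical per-run quality scores under three deployment conditions.
condition_a = [0.91, 0.89, 0.93, 0.90, 0.92, 0.88, 0.94]
condition_b = [0.87, 0.85, 0.90, 0.86, 0.88, 0.89, 0.84]
condition_c = [0.92, 0.94, 0.91, 0.95, 0.93, 0.90, 0.96]

f_stat, p_value = stats.f_oneway(condition_a, condition_b, condition_c)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")

# A small p-value suggests at least one condition's mean differs; follow-up
# pairwise comparisons with multiple-testing correction would identify which.
```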