Job Description
Role: Data Scientist - Multimodal LLMs (Speech Focus)
About the Role
ConnexAI is developing an ambitious new product to enhance our large language models with speech-to-speech capabilities. This greenfield project offers a unique opportunity to help define its research direction and build the machine learning systems that will power it.
We’re seeking a data scientist with a strong research background in machine learning and a focus on speech or multimodal systems. In this role, you’ll work at the intersection of speech and language technologies, exploring how to integrate these modalities into deployable models. You’ll collaborate closely with engineers, researchers, and product leaders to design, prototype, and deploy state-of-the-art models.
What You'll Be Doing
1. Researching state-of-the-art approaches for incorporating audio data into multimodal LLMs for speech-to-text, text-to-speech, and speech-to-speech tasks
2. Implementing and adapting techniques from recent academic papers into practical, production-ready solutions
3. Training and fine-tuning models, and iterating on architectures to improve performance and scalability
4. Sourcing, curating, and preparing datasets for model training and evaluation
5. Defining evaluation metrics and te...