About the role:
Join a specialist machine learning team working at the intersection of deep learning, model optimisation, and efficient deployment. You will help build and deploy advanced ML models for low-latency speech recognition and foundation LLMs, focusing on reducing power consumption while maximising performance.

Your work will include:
- Training state-of-the-art models on production-scale datasets.
- Compressing and optimising models for accelerated inference on modern hardware.
- Researching and implementing innovative ML techniques tailored for efficient deployment.
- Deploying and maintaining customer-facing training libraries.

Your initial focus will be on speech recognition models, where you will:
- Optimise training workflows for multi-GPU environments.
- Manage and execute large-scale training runs.
- Tune hyperparameters to improve both inference quality and performance.

What you’ll be working on:
This is an end-to-end optimisation role, from algorithms through to deployment on modern silicon, with a mission to enable high-performance, low-power AI in production environments. You will work on deep technical challenges alongside engineers and researchers who care about efficiency, precision, and impact.

What they're looking for:
- Strong practical experience in training deep learning models at scale.
- Knowledge of optimising ML workflows for multi-GPU environments.
- Experience with model compression, quantisation, and ...