Senior Machine Learning Engineer
Location: London, UK
About the Role
We're looking for an experienced Machine Learning Engineer to lead the development and training of advanced large-scale language models. In this role, you will be responsible for improving the performance and reliability of next-generation AI systems, with a focus on models that assist with complex real-world tasks. You'll work closely with cross-functional teams, including infrastructure, product, and research, to shape both the training pipeline and the evaluation of highly capable models.
Key Responsibilities
Design and execute large-scale training experiments in multi-GPU and distributed environments using cutting-edge ML frameworks.
Lead both supervised fine-tuning (SFT) and reinforcement learning (RL) workflows to improve model performance on domain-specific tasks.
Build, maintain, and optimise custom training pipelines, including dataset preparation, distributed training primitives, and scheduling of multi-node jobs.
Collaborate across engineering and research teams to align training goals with product priorities and performance metrics.
Troubleshoot training problems such as instability, scaling issues, and GPU utilisation bottlenecks.
What We're Looking For
Experience: 3–5+ years working in ML engineering or applied machine learning roles, with hands-on responsibility for training and deploying models in production-like environments.
Technical Skills:
Strong proficiency with PyTorch, including distributed training (e.g., DDP/FSDP).
Practical experience training large sequence models or transformer-based architectures.
Comfortable building and maintaining data pipelines, optimising large datasets, and handling model scaling challenges.
Solid software engineering fundamentals: clean, maintainable code and version control best practices.
System Knowledge: Hands-on experience with multi-node GPU clusters, orchestration tools (e.g., Kubernetes, Slurm), and performance tuning.
Communication: Clear and effective communicator, able to share insights with both technical and non-technical stakeholders.
Desirable Qualities
Experience with reinforcement learning workflows and sequence-level reward strategies.
Familiarity with model evaluation tools and benchmarks for large-scale AI systems.
A proactive, collaborative mindset that thrives in a fast-moving environment where innovation and experimentation are central.