Systems Research Engineer - AI Infrastructure / Distributed Systems
We're partnered with a leading advanced technology research centre in the UK, focused on building next-generation AI-native infrastructure. They're seeking a Systems Research Engineer for AI Infrastructure and Distributed Systems to join them onsite in Edinburgh on a permanent basis.
As large language models continue to reshape the software stack, this team is pioneering scalable, high-performance systems for training and serving AI at data centre scale.
Sitting at the intersection of cutting-edge research and real-world deployment, the group transforms novel system architectures into production-ready technologies that will define the future of distributed AI.
This is an excellent opportunity for engineers with a strong systems background who want to work on complex, research-driven challenges in distributed infrastructure, AI serving, and performance optimisation.
The Role
You will design and build core components of distributed AI systems, working across infrastructure layers to improve scalability, efficiency, and performance of large-scale model serving environments.
Key Responsibilities:
Distributed Systems Engineering
* Design, implement, and evaluate distributed system components for AI and data-intensive workloads
* Build scalable infrastructure across heterogeneous environments (CPU, GPU, accelerators)
* Develop advanced scheduling and serving systems for large-scale AI workloads
Performance Optimisation
* Profile and optimise large-scale inference pipelines
* Improve key-value cache efficiency and memory scheduling
* Identify bottlenecks and enhance system scalability using systematic performance analysis
AI Serving Infrastructure
* Develop low-latency, multi-tenant, fault-tolerant model serving systems
* Work on areas such as cache sharing, data locality, and cluster scheduling
* Prototype and evaluate next-generation inference architectures
Research & Innovation
* Contribute to cutting-edge systems and ML research
* Publish at leading conferences and drive internal adoption of new approaches
Collaboration
* Work closely with global research and engineering teams
* Communicate technical findings and system insights clearly
Requirements
* BSc or MSc in Computer Science, Electrical Engineering, or a related field
* Strong fundamentals in distributed systems and operating systems
* Experience with machine learning systems and AI inference infrastructure
* Hands-on experience with LLM serving frameworks
* Strong programming skills in C/C++
* Proficiency in Python for prototyping and experimentation
* Experience with performance profiling and optimisation tools
* Solid understanding of distributed algorithms
Nice to Have
* PhD in systems, distributed computing, or AI infrastructure
* Publications in top-tier systems or ML conferences
* Experience with load balancing, fault tolerance, and resource scheduling
* Background in large-scale cloud or AI infrastructure environments
Why Apply?
* Work on cutting-edge AI infrastructure challenges at scale
* Bridge research and real-world system deployment
* Collaborate with leading experts in distributed systems and AI
* Shape the future of large-scale AI systems
In accordance with local employment laws, applicants must have current, valid authorisation to work in the UK at the time of application. We are unable to sponsor employment visas for this role, so applications from individuals without existing UK work authorisation cannot be considered.
If this sounds interesting and you'd like to learn more, click the link below to apply or email me with a copy of your CV on
By applying to this role you understand that we may collect your personal data and store and process it on our systems. For more information, please see our Privacy Notice (https://eu-recruit.com/about-us/privacy-notice/)