Job Description
The rapid evolution of large-scale AI models is transforming how modern computing systems are designed and deployed. A pioneering research-driven engineering team is seeking Systems Research Engineers to help shape the next generation of AI-native infrastructure. This role sits at the intersection of systems research and large-scale engineering, focusing on the distributed architectures that support the training, serving, and optimisation of advanced AI models across diverse environments.
This role offers a unique blend of hands-on engineering and forward-looking research, ideal for engineers who want to push the boundaries of distributed systems and AI infrastructure while working on real-world, high-impact platforms.
What You’ll Work On
* Build and experiment with distributed system components tailored for data-intensive and AI-driven workloads.
* Design scalable infrastructure capable of operating across diverse hardware environments, including CPUs, GPUs, and accelerators.
* Develop high-performance model serving systems with a focus on efficiency, scalability, and resilience.
* Analyse system behaviour using profiling tools to uncover performance bottlenecks and optimisation opportunities.
* Improve memory usage, caching strategies, and scheduling efficiency in large-scale inference systems.
* Create solutions that enable low-latency, multi-tenant AI services in distributed environments.
* Explore and prototype new approaches to inference architecture and cluster-level orchestration.
* Translate technical innovations into tangible outcomes, including internal adoption and external publications.
* Work closely with global teams to shape long-term infrastructure direction and strategy.
What You’ll Bring
* PhD in Computer Science, Electrical Engineering, or a related discipline.
* Strong foundation in distributed systems and operating systems principles.
* Understanding of machine learning infrastructure and large-scale model serving.
* Experience with systems-level programming in C/C++.
* Proficiency in Python for experimentation and rapid prototyping.
* Familiarity with distributed algorithms and system design trade-offs.
* Experience using performance analysis and profiling tools.
* Ability to communicate complex ideas clearly and work effectively in collaborative environments.
Nice to Have
* Doctoral research in distributed systems, large-scale infrastructure, or AI platforms.
* Contributions to recognised systems or machine learning conferences.
* Hands-on experience with load balancing, fault tolerance, or cluster scheduling.
* Exposure to distributed caching, state management, or high-performance cloud systems.
* Experience building or optimising large-scale AI or cloud infrastructure.
Why This Role Stands Out
* Be part of a team shaping the infrastructure behind next-generation AI systems.
* Work on problems that combine deep technical research with real-world deployment.
* Gain exposure to cutting-edge architectures in distributed computing and AI.
* Collaborate with globally recognised experts in systems and machine learning.
* Opportunity to publish, innovate, and influence future technology directions.
* Accelerate your career in one of the fastest-growing areas of technology.
If you’re a motivated and skilled professional ready for your next challenge, apply now or send your CV to nk@eu-recruit.com.