Senior reinforcement learning engineer – environment design & sector expansion

Cheltenham

Blueberry Capital

Engineer

Posted: 6h ago

Offer description

Location: Remote / Hybrid (preferred: EU-based)

Type: Full-time

Compensation: Competitive, with equity options

Clearance: Not required, but experience with defense, healthcare, or robotics data is a plus

About Us

Diambra is a cutting-edge reinforcement learning (RL) competition platform designed for multi-agent systems. Today, we're focused on high-intensity gaming environments where developers deploy RL agents that compete in classic video game arenas.

Our goal: build a proving ground for the future of intelligent systems.

We're now expanding beyond games into robotics, healthcare, and other real-world sectors where RL has transformative potential. Our platform will become a crowdsourced engine for solving high-impact problems across industries, driven by the world’s top RL developers.

The Role

We’re looking for a Senior Reinforcement Learning Engineer who can bridge the gap between deep RL infrastructure and real-world applications.

You’ll work closely with our internal engineering team, research advisors, and strategic partners to define, build, and validate new multi-agent RL environments in sectors like robotics, autonomous navigation, medical workflows, or logistics.

This role is ideal for someone who wants to shape the next generation of open RL challenges, building meaningful benchmarks and environments that matter outside of simulation.

Responsibilities

- Analyze and contribute to Diambra's existing RL environment and tournament infrastructure

- Collaborate with partner organizations in robotics, healthcare, defense, etc. to scope out RL-relevant problems

- Design and implement custom environments tailored to real-world constraints, with clean interfaces for developer participation

- Work with simulation tools (e.g. MuJoCo, Isaac Gym, Unity, ROS, CARLA) to translate physical-world problems into virtual testbeds

- Build modular APIs and wrappers for custom observation/action spaces, reward shaping, and environment resets

- Help define metrics, benchmarks, and evaluation protocols for crowdsourced RL competitions

- Engage with our developer community to validate environments and iterate based on feedback

Requirements

- 5+ years of experience working with reinforcement learning in applied or research settings

- Strong Python skills and deep familiarity with at least one RL framework (Stable-Baselines3, RLlib, CleanRL) or with custom training loops

- Experience designing or modifying RL environments (e.g. Gym, PettingZoo, Isaac Gym, Unity ML-Agents, or custom simulators)

- Ability to collaborate with external stakeholders and partners from academia, startups, or industry

- Excellent written and verbal communication skills; able to document and communicate environment specs clearly

Bonus Points For

- Experience working on RL for robotics, healthcare, aerospace, supply chains, or defense

- Familiarity with multi-agent learning, MARL frameworks, and decentralized training

- Open-source contributions in the RL or simulation ecosystem

- Prior participation in, or hosting of, RL competitions (e.g. NeurIPS, AIcrowd, Kaggle)

- Familiarity with Web3/game mechanics for incentive design (for gaming-oriented environments)

Why Join Us?

- Be part of the next wave of AI infrastructure, enabling RL agents to move from games to real-world utility

- Work at the intersection of research, product, and community in a deeply technical environment

- Collaborate with leading institutions, engineers, and domain experts across sectors

- Shape open RL benchmarks that can advance progress in robotics, healthtech, and more
