CommonAI CIC is a dynamic non-profit membership organisation driven by the vision of collaborative engineering to advance the safe and responsible development of foundational AI technologies. Our community brings together AI startups, large and small enterprises, public sector organisations and academia, all dedicated to sharing resources and knowledge to co-create and rapidly scale innovative businesses.
We are looking for a talented Senior Linux Systems Administrator to become a key player in our fast-growing engineering team, responsible for provisioning and maintaining robust multi-rack GPU server clusters tailored for both inference and training workloads. If you're a hands-on engineer who thrives at the cutting edge of high-end hardware and intricate software orchestration, this is a great opportunity to work with autonomy and influence.
Requirements
You should be able to demonstrate:
Significant experience working as a professional Systems Administrator or in an equivalent technical role
Advanced Linux skills and experience working with servers running hypervisors/VMs and multiple distributions (e.g. Ubuntu, RHEL)
Proficiency in scripting (e.g. Python, Bash), source code management (e.g. Git) and infrastructure-as-code tools (e.g. Terraform, OpenTofu)
A strong understanding of network design and switch/router/firewall configuration
The desire to work collaboratively with users and other stakeholders to iteratively optimise systems
The ability to travel to data centres and install rack-mounted equipment
Experience with any or all of the following will also be highly valued:
Modern GPU server deployment, tuning and management
High-performance or high-availability storage servers/clusters (Lustre, Ceph, NFS)
Advanced networking technologies (InfiniBand, RDMA, RoCE)
HPC workload managers (Slurm, LSF)
Our infrastructure is used for research and development. Support will generally only be required during UK business hours; however, major maintenance may occasionally be scheduled for weekends.
Benefits
A collaborative and supportive work environment
The opportunity to have a high impact in a growing organisation
Competitive salary package and pension
Professional development opportunities
Networking opportunities with influential people from across the tech sector and academia
A vibrant office environment located a few minutes' walk away from Cambridge train station
CommonAI CIC is an equal opportunity employer and is committed to creating an inclusive and diverse workplace.