Overview
This role is part of the Research Platforms team in IT Services, supporting the University’s research computing infrastructure including high-performance compute clusters, scalable storage, and cloud-based platforms. The role involves ensuring the reliability, availability and configuration of these platforms to meet the diverse needs of researchers.
Main Duties and Responsibilities
* Deliver, run and support HPC compute platforms that meet the needs of the full spectrum of research applications.
* Manage high-performance, scalable storage systems for I/O-intensive compute tasks.
* Develop platforms that provide secure environments for storing and processing sensitive research data.
* Support the use of cloud-based platforms and technologies by researchers.
* Manage and configure scheduling software and implement policies to allocate resources, including allowing research groups to purchase dedicated access.
* Carry out routine maintenance tasks and identify opportunities to improve and automate them across on-premises and cloud platforms.
* Monitor performance, availability and security of the research platforms, providing regular reports.
* Collaborate with other ITS team members to provide technical guidance and ensure appropriate platform delivery to meet researcher requirements.
* Assist research groups in migrating activities from legacy clusters to appropriate solutions.
* Develop innovative solutions that simplify access to the platform, lowering the barrier for users with limited HPC experience.
* Work with vendors and service providers to procure and maintain infrastructure and services that meet the University’s research computing requirements.
* Perform other duties as required at this grade.
Essential Criteria
* Understanding of, and experience with, HPC and cloud-based research computing platforms, including job schedulers (e.g., Slurm), virtualisation (e.g., VMware, AWS EC2) and containerisation (e.g., Kubernetes, Docker Swarm, Podman).
* Knowledge of multi-user systems, user account management, authentication and permissions.
* Ability to work within existing admin processes and develop robust automations with future resilience, replicability and management in mind. Knowledge of infrastructure-as-code technologies such as Puppet and CloudFormation is a distinct advantage.
* Effective communication skills, both written and verbal, including report writing and technical documentation.
* Understanding of monitoring and managing physical and virtual computing platforms and infrastructure.
* Understanding of how to use cloud infrastructure and networking to ensure system and data security.
* Ability to assess and organise resources, plan and progress work activities.
* Inquisitive mind and a desire to explore new technologies and engage with other research computing professionals.
Desirable Criteria
* Experience with high-performance storage systems and scalable I/O solutions.
* Experience with procurement and vendor management in a research computing environment.
Additional Information
Grade: G7
Line Manager: Research Platforms Engineering Lead
Direct reports: None
Employment Conditions
A basic DBS check will be required for this role.
We are a Disability Confident Employer.
Benefits
The University offers a competitive annual leave entitlement (including the ability to purchase additional days), a generous pension scheme, flexible working opportunities, a commitment to professional development and wellbeing, retail discounts, and family-friendly policies, including paid time off for parenting and caring emergencies, support for menopause, fertility treatment and more. Full benefit details are available at https://www.sheffield.ac.uk/jobs/benefits.