About Bull
Bull is the Atos Group brand for high-performance computing, artificial intelligence and quantum innovations, with 2,500 employees. Built on an open, end-to-end and trusted foundation, Bull designs, deploys and runs hardware and software while providing strategic services that unlock enterprise value, accelerate scientific research and drive society forward. Driven by world‑class R&D with 1,500 patents, manufacturing capabilities and data science, Bull enables nations and industries to fully control their AI and data, advancing progress for the benefit of the planet.
About Atos Group
Atos Group is a global leader in digital transformation with c. 63,000 employees and annual revenue of c. €8 billion, operating in 61 countries under two brands — Atos for services and Eviden for products. European number one in cybersecurity, cloud and high‑performance computing, Atos Group is committed to a secure and decarbonised future and provides tailored AI‑powered, end‑to‑end solutions for all industries.
Location
Primarily on‑site at a customer facility near Reading, Berkshire, with occasional support for additional HPC installations across Europe.
Requirement
Must be eligible for UK Developed Vetting (DV) security clearance.
About The Role
Bull’s High‑Performance Computing (HPC), Artificial Intelligence & Quantum Business Unit is seeking a Hybrid Hardware & Software Support Engineer to join our HPC Services team. This is a highly visible, customer‑facing operational role supporting advanced HPC infrastructures in the UK. You will work across computing, storage, and networking layers, ensuring the deployment, stability, and performance of large‑scale Linux‑based systems. While prior HPC experience is an advantage, it is not mandatory – strong Linux and infrastructure engineers eager to grow into HPC & AI are encouraged to apply.
Key Responsibilities
Deployment & System Bring‑Up
* Install, configure, and integrate HPC cluster components (compute, storage, networking).
* Perform system installation, initial configuration, and operational readiness checks.
* Apply patches, updates, and conduct routine maintenance activities.
Hybrid Hardware & Software Support
* Provide Level 1 and Level 2 operational support for HPC environments.
* Diagnose and resolve issues involving:
  * Linux operating systems
  * Enterprise server hardware
  * High‑speed interconnects
  * Storage subsystems
* Conduct root cause analysis and implement corrective actions.
* Escalate appropriately within the global support organisation when needed.
Operations & Incident Handling
* Monitor system health and respond to incidents proactively.
* Perform troubleshooting in secure, mission‑critical environments.
* Maintain detailed and accurate documentation of incidents and resolutions.
Customer Interface
* Act as the primary technical contact on‑site.
* Communicate effectively regarding incidents, planned maintenance, and system status.
* Build trusted relationships with customer technical stakeholders.
* Represent Bull professionally in sensitive and high‑profile environments.
Core Technical Skills
* Strong Linux expertise (Red Hat and/or Debian‑based environments)
* Solid understanding of enterprise server hardware (CPU, memory, storage, diagnostics)
* Scripting skills in Bash and/or Python
* Strong networking fundamentals (TCP/IP, routing, switching, security basics)
* Hands‑on experience with infrastructure deployment, configuration, and maintenance
* Excellent troubleshooting and analytical abilities
* Proactive mindset and ability to work independently
Desirable Skills & Experience
* Experience with HPC clusters
* High‑speed networking (40/100GbE, InfiniBand)
* Virtualisation technologies (KVM, OpenStack)
* Storage systems (Ceph, SAN/NAS)
* Parallel filesystems (Lustre, GPFS, BeeGFS)
* Containers (Docker, Podman, Kubernetes)
* Configuration management (Ansible, Puppet)
* Monitoring and observability tools (Prometheus, Grafana, Icinga)
* Workload managers (Slurm, PBS Pro)
* Git version control
Candidate Profile
The ideal candidate:
* Is hands‑on, operationally focused, and detail‑oriented
* Thrives in secure, mission‑critical environments
* Approaches troubleshooting methodically, even under pressure
* Communicates clearly with both technical and non‑technical stakeholders
* Takes full ownership of incidents through to resolution
* Is motivated to learn continuously and expand their technical expertise
Education & Experience
* Option 1: Degree in Computer Science, Engineering, or related field + at least 2 years of relevant experience
* Option 2: 5+ years of relevant industry experience
Strong early‑career candidates with solid technical foundations will also be considered.
Benefits
* Working on advanced HPC and digital infrastructure projects
* Continuous learning and technical skill development
* Career growth within a global technology organisation
* Participation in internal initiatives and community‑focused activities
What happens next?
* Your application will be reviewed (1‑2 business days)
* Short‑listed candidates will be contacted for a discussion with HR
* Interview with management team
* Feedback (1‑10 business days after the interview)
Let’s grow together.