Platform manager

Telford
World Wide Technology
Manager
Posted: 14h ago
Offer description

World Wide Technology (WWT) is a global technology integrator and IT solutions provider. Established in 1990 in St. Louis, Missouri, WWT collaborates with OEMs like Cisco and Dell EMC to offer infrastructure security and custom app development services to Fortune 500 companies in various sectors. With over 10,000 employees globally, we generate $17 billion in annual revenue and operate in regions including the US, UK, Canada, Europe, Costa Rica, APAC, and the Middle East. We're proud to have been consistently recognized as a top employer by Fortune and Glassdoor for over 13 years.


World Wide Technology Holding Co, LLC (WWT) has an opportunity for an AI Platform Operations Manager. The job description for the role is below. Please provide the information needed to process your application, and I will contact you to discuss the role.


This is a contract role, outside IR35.


Role: AI Platform Operations Manager


Location: United Kingdom. Remote with some travel to London


Contract Duration: 6 Months


Key Responsibilities

* Provide L1 support for customer-reported issues and requests.
* Provide L2 support by diagnosing, replicating, and troubleshooting issues across platform and infrastructure.
* Coordinate resolution of complex (L3) issues with (vendor) product/engineering teams and manage vendor responses.
* Monitor system health, alerts, and customer usage patterns.
* Document solutions, workarounds, and support procedures; create and maintain knowledge content.
* Automate common tasks and fixes.
* Configure and integrate tooling to support optimal operation of the platform, and support tool selection.
* Assist customers with platform configuration, onboarding, and usage best practices.
* Collaborate with platform and infrastructure support/engineering teams to resolve platform integration issues.
* Ensure SLAs and customer satisfaction targets are met.
* Work with customers and multiple stakeholders to understand requirements and challenges, and provide reporting on usage, workflow, and billing.


Technical responsibilities

* Cluster infrastructure management: Manage the Nvidia GPU cluster.
* High availability and resilience: Implement failover strategies and manage maintenance events to minimize downtime.
* Resource allocation and optimization: Resource partitioning (GPU resources), workload scheduling, and capacity planning.
* Performance monitoring and troubleshooting: Performance analysis and real-time monitoring with available Nvidia and HPE tools.
* Incident response: Manage node failures, network issues, and driver issues; troubleshoot common issues and work with vendor support to resolve any critical issues.
* Security and access control: Manage user permissions, RBAC, security hardening, data protection.


Required Skills & Experience

* 10 years of experience (or equivalent) in technical support, system engineering, or platform operations
* Strong understanding of L1 and L2 support processes (ticketing, escalation, troubleshooting)
* Familiarity with cloud-based platforms, APIs, and distributed systems
* Understanding of AI/ML concepts and tooling (model training, inference, data pipelines basics)
* Experience with monitoring/logging tools (e.g., Grafana, Kibana, Splunk)
* Excellent communication skills to interface with both customers and internal / vendor teams
* Good understanding of the tooling requirements of ML engineers and data scientists, and how to optimize their experience.


Core Technical skills:

* System administration experience with OSs like RHEL/CentOS and Ubuntu, including Linux kernel tuning
* Proficiency with Ansible, Nvidia and CUDA toolkits, Kubernetes, and container orchestration
* Understanding of automation, monitoring, and security with GPU as a service


Preferred experience

* Experience supporting HPE PCAI or other AI/HPC infrastructure and platforms.
* Experience with GPU resource allocation (across instances, GPU count, and time)
* Advanced networking skills, including high-performance networking, troubleshooting, and fine-tuning
* Background in DevOps or SRE practices
* ITIL familiarity


All candidates will need to go through a background check.


Must be eligible to live and work in the UK. Only successful candidates will be contacted.


EQUAL OPPORTUNITIES

World Wide Technology is committed to equal opportunities and actively seeks applications from all sectors of the community irrespective of sex, race, color, nationality, ethnic or national origin, disability, marital status, sexual orientation, having responsibility for dependents, age, religion/beliefs, or any other reason which cannot be shown to be justified.


Equal Opportunity Employer Minorities/Women/Veterans/Disabled
