Job Description
Cloud Operations Engineer
£30,000 to £50,000 GBP
Bonus
Onsite working
Location: Cheltenham, Gloucester, South West - United Kingdom
Type: Permanent

Cloud Operations Engineer / Lead Engineer (Multiple Roles)
Location: Cheltenham (Onsite, 5 days per week)
Levels: Multiple hires (junior to senior)
Eligibility: UK Citizen and eligible for SC clearance
Working Pattern: 24/7 shift-based operational environment
Package: Competitive salary depending on experience plus shift allowance

Overview
We are hiring multiple Cloud Operations Engineers and Lead Engineers to join a highly secure, mission-critical cloud operations team.

The role is open to a broad range of backgrounds, including Computer Science graduates, Linux-focused infrastructure engineers, Kubernetes/platform engineers, and individuals from live service or service desk environments with strong incident management experience.

This is a hands-on operational engineering role focused on maintaining the stability, availability, and performance of a complex, secure cloud platform operating at scale.

Key Responsibilities
- Provide frontline operational support for secure cloud infrastructure and platform users
- Troubleshoot and resolve critical incidents across live production systems
- Lead or support incident response, escalation, and coordination during shifts
- Operate within a 24/7 rota supporting high-priority workloads and services
- Follow, maintain, and improve operational runbooks and incident procedures
- Identify opportunities to reduce operational toil and improve service reliability
- Support mentoring and knowledge sharing for junior engineers (senior roles)
- Engage with internal stakeholders and third parties during critical incidents

Technical Environment
- Linux (strong hands-on experience required)
- Kubernetes (deployment, troubleshooting, and platform support)
- Infrastructure as Code (Terraform or similar tools)
- Cloud-native networking and system troubleshooting
- Observability and monitoring tools
- APIs and integration services
- Secure, restricted, air-gapped cloud environments

Required Experience
- Strong experience working with Linux-based systems in production environments
- Background in live service support, infrastructure operations, or platform engineering
- Experience troubleshooting system, application, or network-level issues
- Exposure to Kubernetes and/or containerised environments
- Understanding of infrastructure, networking, and operational support principles
- Ability to operate in high-pressure, incident-driven environments
- Willingness to learn and operate within highly secure cloud architectures

Desirable Experience
- Kubernetes administration or advanced troubleshooting experience
- Infrastructure as Code experience (Terraform or similar)
- Exposure to observability and monitoring platforms
- Experience working in 24/7 operational environments
- Prior experience coordinating shifts or leading small technical teams

deep expertise in secure cloud operations, Kubernetes platforms, and large-scale infrastructure engineering.

Reference: AMC-AQU-COEC
Postcode: GL52