Head of Support & Service Reliability Engineering

Guildford

Sycurio

Engineering

Posted: 22 May

Offer description

We are seeking a Head of Support & Service Reliability to lead and evolve our global support function into a proactive, platform-integrated reliability capability.

This role is an exciting and dynamic opportunity for an outcome-focused individual. Sycurio is at a critical inflection point as we transition from a single-tenant architecture to a multi-tenant SaaS platform, which requires a fundamental shift from reactive ticket handling to systemic reliability, observability, and customer experience management at scale.

You will own the end-to-end operational integrity of the platform, ensuring availability, performance, and customer trust, while partnering closely with Engineering, Product, and Customer-facing teams. You will also be a key contributor to our GRR goal of 90%+.

Sycurio employs a strategic managed service provider that supplies the people, tooling, and day-to-day execution across all support tiers. The Head of Support sets the standards, governs vendor performance, and ensures every aspect of the support experience, from incident response to customer satisfaction, meets enterprise-grade expectations.

Key Responsibilities:

* Service Reliability & Platform Stability

  * Own platform availability, performance, and reliability across all tenants

  * Reduce incident frequency, severity, and blast radius

  * Establish and drive Service Reliability Engineering (SRE) principles

  * Ensure scalability and operational readiness of a multi-tenant platform

* Incident Management & Response

  * Implement and lead a structured incident management framework (P1–P4)

  * Act as executive owner of major incidents (P1/P2)

  * Drive improvements in:

    * Mean Time to Detect (MTTD)

    * Mean Time to Resolve (MTTR)

  * Ensure clear, consistent internal and external communication during incidents

* Observability & Monitoring

  * Define and implement a comprehensive observability strategy, including:

    * Technical telemetry (infrastructure, application, APIs)

    * Business telemetry (transactions, payment success rates, usage)

    * End-to-end customer journey visibility

  * Ensure issues are detected proactively rather than reported by customers

  * Partner with Product and Engineering to embed telemetry into the platform

* Support Operations (L1–L3)

  * Lead global support teams, ensuring high-quality, SLA-driven case management

  * Define and enforce support processes, tooling, and performance standards

  * Improve key metrics:

    * First response time

    * Resolution time

    * Reopen rate

    * Escalation quality

* Platform Operations & Change Management

  * Oversee operational aspects of the platform, including:

    * Release management and deployment safety, ensuring all releases are observable, reversible, and low-risk

    * Change control processes

    * Environment consistency across staging and production

  * Own the visibility and continuous improvement of delivery and recovery performance using DORA metrics, in partnership with Engineering

* Issue Management & Root Cause Discipline

  * Establish rigorous Root Cause Analysis (RCA) standards

  * Identify and eliminate systemic issues (not just symptom fixes)

  * Track and reduce recurring incidents

  * Feed insights into Product and Engineering roadmaps

* Customer Experience & Commercial Alignment

  * Align support with Customer Success and Sales

  * Ensure coordinated communication during incidents

  * Protect customer relationships during critical events

  * Introduce tenant-aware impact assessment (ARR, strategic accounts, regulatory exposure)

  * Support enterprise-grade expectations for transparency and reliability

* Cross-functional Leadership

  * Act as the bridge between:

    * Engineering

    * Product

    * Customer Delivery / Success

  * Embed supportability and operational readiness into:

    * Pre-sales (Stage 4/5 governance)

    * Product development

    * Deployment processes

* Managed Service Governance

  * Chair regular operational reviews and quarterly business reviews with the managed service leadership team

  * Own the managed service scorecard: defining KPIs, reviewing performance data, and driving accountability for misses

  * Manage contract compliance, SLA adherence, and commercial exposure from managed service underperformance

  * Lead continuous improvement programs jointly with the managed service provider, including tooling upgrades, process redesigns, and training investments

  * Maintain an escalation path for systemic or persistent managed service failure, up to and including remediation planning

Key qualifications, skills, experience:

Required

* 10+ years in Support, Platform Operations, or SRE leadership roles

* Proven experience in multi-tenant SaaS and legacy environments

* Strong understanding of:

  * Distributed systems

  * Incident management at scale

  * Observability frameworks

* Track record of building and scaling high-performing operational teams

* Experience in outsourced or hybrid operational models

* Experience working cross-functionally with Engineering and Product

Preferred

* Background in payments, security, or compliance-driven environments (e.g., PCI)

* Experience with API-first platforms and telephony/payment flows

* Familiarity with observability tools (e.g., Grafana)
