Senior Azure SaaS Reliability & Support Engineer - Hybrid (2 days a week in Kingston) - ASAP Start
You will be the bridge between support, engineering, and cloud operations. Your work will include:
* Investigating and fixing complex application and infrastructure issues.
* Monitoring capacity, performance, and error budgets across all deployments.
* Designing automation and tooling to improve reliability and reduce manual work.
Your Responsibilities and Tasks
1. Environment Health & Incident Response
* Monitor ST and MT environments for server performance, response times, error rates, and application health.
* Detect and resolve database issues, stalled file processing, and misplaced storage objects.
* Use Azure diagnostics and telemetry to troubleshoot and resolve complex incidents.
* Provide third-line support for escalated customer cases, collaborating with development for code-level fixes.
2. Reliability Engineering (Fleet Level)
1. Maintain uptime, performance, and scalability across all ST and MT deployments.
2. Define and track service-level objectives (SLOs) and error budgets for different environment types.
3. Perform capacity planning for servers, databases, and storage, scaling resources before issues occur.
4. Identify systemic patterns c...