Overview
About Scrumconnect Consulting: Scrumconnect Consulting is a multi-award-winning digital consultancy, recognised for delivering impactful technology solutions across UK government departments. Our work has positively influenced the lives of over 40 million UK citizens. With a strong commitment to user-centred design and agile delivery, we deliver innovative digital services that matter.
Responsibilities
* Run, manage, and continuously evolve the AWS and secure on-premise environments to ensure availability.
* Lead Level 3 (L3) production support and non-production environment maintenance, including 24/7 on-call support.
* Ensure all services are compliant with security standards and support the change and release governance model.
* Build and maintain infrastructure components such as event streaming (Kafka), databases (Aurora, RDS, Redis), identity management (Keycloak), and caching layers.
* Enhance and maintain CI/CD tooling and self-service developer pipelines for tenant teams.
* Proactively manage and resolve tech debt by working with central governance bodies, and ensure it is visible to the board.
* Increase automation, observability, and testing coverage across the platform components while enabling data-driven decision-making.
* Align delivery with the product roadmap, collaborating with internal/external platform and infrastructure teams to support scalable and resilient services.
* Support critical national infrastructure tasks including platform deployments, incident/problem/change management, and continual service improvement (ITIL-aligned).
* Use and integrate ServiceNow (or its successor) to track and manage changes, incidents, requests, and problem records.
* Support replication services, match engines, and secure data flows (Impex, Threat, Kafka replication).
Preferred Tech Stack Expertise
* Cloud Infrastructure: AWS (EKS, RDS, Aurora, ElastiCache, Kafka, IAM)
* Secure Hosting: Experience working with air-gapped or government-secure environments
* Container & Cluster Management: Docker, Kubernetes, Rancher, Jenkins, Helm
* Monitoring & Observability: Prometheus, Grafana, ELK Stack, Dynatrace
* Secrets & Identity Management: HashiCorp Vault, Keycloak
* CI/CD & DevOps Tooling: Jenkins, Git, ServiceNow, Trivy, Terraform
* Streaming & Messaging: Apache Kafka (including Kafka Replication)
* Data Layers: PostgreSQL, Redis, RDS
* Automation: IaC, pipeline build automation, event relay tooling
* Scripting: Bash, Python, Groovy, Lambda functions