Senior Infrastructure Engineer
Reports to: Head of Infrastructure and Security
Location:
* Hybrid with a requirement to be in the office on average once a week
* Harwell Campus, Near Didcot, Oxfordshire, UK
Job Brief
This is a critical role to support the design, implementation, and maintenance of our AWS-based infrastructure and internal IT systems. This role blends technical execution, stakeholder engagement, and operational leadership.
You'll work closely with the Head of Infrastructure and Security and will be empowered to lead initiatives, cover leadership responsibilities in their absence, and act as a trusted technical liaison with stakeholders across the business.
The role acts as a technical liaison between development teams and the Infrastructure and Security team, focusing on understanding technical requirements and implementing the key systems and processes that underpin the delivery of our software services.
Key Responsibilities
Operational Execution
* Design, build, and maintain secure, scalable AWS infrastructure using best practices
* Implement and manage Infrastructure as Code using Terraform
* Manage CI/CD pipelines, deployment automation, and observability tooling
* Participate in on-call rotation and incident response, leading investigations and resolutions as needed
* Assist with the deployment, scaling, and management of containerised applications in Kubernetes clusters
* Troubleshoot issues within Kubernetes environments, including pod failures, networking, and storage problems
* Patch and maintain Kubernetes clusters
* Implement monitoring and alerting using tools such as Prometheus, OpenTelemetry, and Grafana
* Support internal IT functions including endpoint provisioning, device management (MDM), and access controls
Incident Response & Troubleshooting
* Actively participate in incident response for cloud infrastructure and mobile device-related issues, ensuring timely resolution and minimising system downtime
* Collaborate with senior engineers to investigate root causes of incidents and implement preventive measures
* Implement and support a centralised logging solution for troubleshooting and incident resolution
Security Operations
* Maintain and improve security posture across cloud and internal systems
* Implement IAM policies, encryption standards, vulnerability management, and monitoring
* Meet compliance requirements through documentation and operational controls
* Support DevSecOps initiatives and integrate security controls into CI/CD pipelines
IT Operations & Endpoint Management
* Oversee the provisioning, configuration, and lifecycle management of employee devices (laptops, mobile devices, etc.)
* Implement and manage endpoint protection, MDM (Mobile Device Management), and patching strategies
* Ensure secure access to corporate tools and systems, with appropriate controls in place
* Manage user identity and access across systems (e.g., SSO, MFA, directory services)
Stakeholder Engagement & Technical Discovery
* Act as a key point of contact for cross-functional stakeholders to gather, clarify, and translate technical infrastructure and security requirements
* Plan and scope infrastructure improvements or migrations based on stakeholder feedback
Leadership
* Mentor junior team members and contribute to knowledge-sharing and process documentation
* Provide leadership cover for the Head of Infrastructure and Security as required, including participation in planning meetings, decision-making, and communication with leadership
* Provide operational support for the Head of Infrastructure and Security when they are unavailable, ensuring continuity of engineering operations and technical support
Working (& Desirable) Technology Stack
* Experience in cloud infrastructure, DevOps, or platform engineering with security responsibilities
* Experience of using the AWS Well-Architected Framework to implement solutions
* Understanding of Prometheus, OpenSearch, and Grafana for logging and alerting
* Containerisation and orchestration with Docker and Kubernetes
* Hands-on experience with Infrastructure as Code (Terraform)
* Understanding of the architectural principles of building with cloud-native technologies
* Knowledge of security frameworks
* Understanding of ITIL frameworks and incident management
The following technology experience is desired but not required:
* PostgreSQL
* Python (particularly boto3) and Bash for automation
Personal Skills and Experience
* Relevant professional-level qualification or experience
* Experience as a technically involved DevOps Engineer
* Analytical, organised, and effective approach to setting priorities
* Positive and approachable demeanour
* Proactive approach to problem solving
* Experience building highly automated infrastructures
* Awareness of DevOps and Agile principles
* Proven ability to cascade information and coach users in cloud architecture
* Willingness to participate in the Architecture Guild, as will be expected of the successful candidate
Candidates should be able to demonstrate good levels of:
* Problem-solving
* Understanding of how logging solutions can be used for alerting and incident resolution
* Teamwork
* Composure under pressure
* Written communication
* Verbal communication
* Mentoring more junior members of staff