Site Reliability Engineer, Warrington, Cheshire
Client: Ranger Technical Resources
Location: Warrington, Cheshire
Job Category: Other
EU work permit required: Yes
Posted: 31.05.2025
Expiry Date: 15.07.2025
Job Description:
Site Reliability Engineer #2494
Position Summary:
Our partner, an innovative PaaS company specializing in remote monitoring and network management solutions, is seeking a Site Reliability Engineer to ensure the reliability, scalability, and performance of critical infrastructure and applications. The role involves building and maintaining highly available systems, supporting CI/CD pipelines, and optimizing solutions for the company’s products. Collaboration with development, DevOps, and other teams is essential to maintain high uptime, security, and user experience standards for millions of endpoints.
Experience and Education:
* Bachelor's or higher degree in Computer Science, Information Systems, Information Technology, or a related field, or equivalent experience.
* 7+ years of experience in Site Reliability Engineering, DevOps, Infrastructure, or related roles.
* Deep understanding of AWS and its services.
* Strong Linux administration and troubleshooting skills.
* Experience with CI/CD pipelines and Infrastructure as Code (IaC).
* Proficiency with monitoring and observability tools (e.g., New Relic, Datadog, Splunk).
Skills and Strengths:
* AWS (Amazon Web Services)
* Auto Scaling
* Fargate
* Route 53
* Observability tools (New Relic, Datadog, Splunk)
* Scripting and automation (Ansible, Bash, Python, Go)
* CI/CD processes
Primary Job Responsibilities:
* Design and support EC2/ECS/EKS/Fargate environments for high availability.
* Implement AWS features (Route 53, ALB/NLB, multi-region setups) for global reliability.
* Maintain and optimize CI/CD pipelines for efficient software delivery.
* Collaborate with development, QA, and DevOps teams to embed best practices.
* Manage monitoring tools to detect and resolve system issues proactively.
* Administer Linux servers and applications, ensuring stability and security.
* Implement containerization solutions for scalability and efficiency.
* Apply security best practices across AWS environments.
* Develop automated incident response and self-healing mechanisms.
* Diagnose and resolve infrastructure and performance issues.
* Design backup, failover, and disaster recovery strategies.
* Create monitoring dashboards and alerting systems.
* Work with teams to optimize infrastructure for cost-effectiveness.