Client:
Ranger Technical Resources
Location:
Job Category:
Other
-
EU work permit required:
Yes
Job Views:
3
Posted:
31.05.2025
Expiry Date:
15.07.2025
Job Description:
Site Reliability Engineer #2494
Position Summary:
Our partner, an innovative PaaS company specializing in remote monitoring and network management solutions, is looking for a Site Reliability Engineer to help ensure the reliability, scalability, and performance of critical infrastructure and applications. You will build and maintain highly available systems, support and optimize CI/CD pipelines, and identify optimal solutions for the company's products. Collaboration with development, DevOps, and other teams is essential to maintaining high standards of uptime, security, and user experience across millions of endpoints.
Experience and Education:
* Bachelor's or higher degree in Computer Science, Information Systems, Information Technology, or a related field, or equivalent experience.
* 7+ years of experience in Site Reliability Engineering, DevOps, Infrastructure, or related roles.
* Deep understanding of AWS services and modules.
* Strong Linux administration and troubleshooting skills.
* Experience with CI/CD pipelines and Infrastructure as Code (IaC).
* Experience with monitoring and observability tools like New Relic, DataDog, or Splunk.
Skills and Strengths:
* Amazon Web Services (AWS)
* Auto Scaling, Fargate, Route53
* Observability tools (New Relic, DataDog, Splunk)
* Scripting (Ansible, Bash, Python, Go)
* CI/CD processes
Primary Job Responsibilities:
* Design and support high-availability AWS environments (EC2, ECS, EKS, Fargate).
* Implement AWS features such as Route53, ALB/NLB, and multi-region setups for reliability.
* Maintain and optimize CI/CD pipelines for efficient software deployment.
* Collaborate with teams to integrate best practices into build and release processes.
* Implement and enhance monitoring tools for proactive system management.
* Administer and optimize Linux servers for stability and security.
* Implement containerization to improve scalability.
* Apply security best practices and ensure compliance.
* Develop automated incident response and self-healing solutions.
* Diagnose performance issues across infrastructure, network, and applications.
* Design backup, failover, and disaster recovery strategies.
* Create dashboards and alerting systems for system health monitoring.
* Optimize infrastructure costs without compromising performance.