Site Reliability Engineer #2494
Our partner, an innovative PaaS company specializing in remote monitoring and network management solutions, is looking for a Site Reliability Engineer to help ensure the reliability, scalability, and performance of its critical infrastructure and applications. You’ll collaborate closely with development, DevOps, and other teams to maintain high standards of uptime, security, and user experience across millions of endpoints.
Bachelor's or higher degree in Computer Science, Information Systems, Information Technology, or a related technical field, or equivalent experience.
7+ years of experience in Site Reliability Engineering, DevOps, Infrastructure, or related roles.
Deep understanding of AWS and its various modules and services.
Strong background in Linux administration and troubleshooting.
Proven experience with monitoring and observability tools to proactively manage system health.
AWS (Amazon Web Services)
Scripting (Ansible, Bash, Python, Go)
Implement advanced AWS features (Route53, ALB/NLB, multi-region setups) to ensure global reliability.
Maintain and optimize the existing CI/CD pipelines and deployment processes to streamline software delivery, reduce risks, and ensure seamless integration of new features.
Collaborate with Development, QA, and DevOps teams to integrate best practices into build and release processes.
Implement, manage, and enhance monitoring tools to proactively detect and resolve system issues.
Administer and optimize Linux-based servers and applications, ensuring stability, performance, and security.
Implement security best practices across AWS environments, ensuring compliance with industry standards and safeguarding cloud infrastructure.
Identify, diagnose, and resolve infrastructure, networking, and application performance issues and bottlenecks to ensure operational efficiency.
Create real-time monitoring dashboards and alerting systems to track system health, capacity, and performance trends.
Work closely with development teams to fine-tune infrastructure for cost efficiency while maintaining high performance.