About the Company
Our client is a leading global provider of communication and collaboration solutions, including cloud-based video conferencing and IP voice communication products. The company works with international partners and supports customers across multiple regions.
About the Role
We are looking for a Site Reliability Engineer (SRE) to support the operation and reliability of overseas cloud services. You will help ensure platform stability, handle incidents, support service requests and changes, and drive automation and continuous improvement to enhance service availability and performance.
Key Responsibilities
* Operate and maintain overseas cloud services to ensure stable and reliable platform performance.
* Monitor system health, identify performance bottlenecks, and implement improvements.
* Manage operational activities including incident management, service request handling, problem management, and change management.
* Perform software updates and deployments, and maintain core platform systems.
* Respond to major and minor service disruptions, restore services, and conduct root cause analysis (RCA).
* Develop and maintain automation tools/scripts to improve operational efficiency and reduce manual work.
* Maintain clear documentation such as runbooks, SOPs, and technical procedures.
* Participate in an on-call support roster to ensure timely response to production issues.
Requirements
* Degree in Computer Science, Information Technology, Engineering, or a related discipline (or equivalent practical experience).
* At least 2 years of relevant experience in SRE / DevOps / Cloud Operations / Platform Engineering or related roles.
* Strong knowledge of Linux system administration and troubleshooting.
* Hands-on experience with containers and Kubernetes.
* Familiarity with common automation and configuration tools such as Ansible.
* Experience with at least one scripting language, such as Python or Shell.
* Exposure to public cloud environments such as AWS and/or Azure is an advantage.
* Good communication and stakeholder management skills, with the ability to work across teams.
* Strong problem-solving skills and ability to work effectively in a production support environment.
By applying to this job advertisement or responding to our messages, you acknowledge and consent to being contacted by PDR Group and its representatives through various communication channels (including but not limited to phone calls, emails, text messages, and messaging platforms such as WhatsApp or LinkedIn) regarding current and future job opportunities. You may withdraw your consent at any time by notifying us in writing.
EA License Number: 24C2623 | R1216562