Job Summary:
The System Administrator is responsible for supporting day-to-day server operations, installations, maintenance, and troubleshooting within large-scale data center environments. This role involves managing hardware deployments, ensuring optimal server performance, and providing first-level technical support for server-related incidents.
Key Responsibilities:
Server Deployment & Capacity Planning
* Plan and organize server installations in racks, ensuring proper mounting and arrangement.
* Develop and implement capacity management strategies for large-scale data centers to optimize resource utilization.
* Gather server requirements and ensure adequate provisioning of space, power, and rack capacity.
Server Installation & Configuration
* Install and configure operating systems for newly mounted servers.
* Reinstall server operating systems and resolve related faults.
Maintenance & Troubleshooting
* Perform daily server maintenance, troubleshooting, and repair activities.
* Follow up on break-fix incidents and ensure timely resolution.
* Collaborate with remote vendors or other technical teams to resolve hardware batch failures and issues.
* Conduct server network troubleshooting and general system diagnostics.
Asset & Lifecycle Management
* Maintain accurate data on internal systems including asset management, ticketing, and rack-related records.
* Collect, verify, and monitor online asset status and issues.
* Manage server lifecycle activities including drive erasure, hardware retirement, and relocation.
* Submit and track part RMAs or media destruction requests.
Operations Support & Escalation
* Provide on-call support for issues raised by business stakeholders.
* Assist with retrofitting, and provide testing feedback on tools, systems, and platforms.
* Escalate complex technical issues to Senior SOE engineers when required.
Qualifications & Requirements:
* Education: Bachelor’s Degree or Diploma in Computer Science, Electrical Engineering, or a related field.
Technical Skills:
* Strong understanding of server operations and hardware components.
* Proficiency with Linux systems for troubleshooting server software/hardware issues.
* Basic knowledge of networking concepts, including MAC addressing, subnetting, and TCP/IP.
* Familiarity with out-of-band/lights-out server management (e.g., IPMI).
* Ability to read, understand, and run simple Shell/Bash scripts.
Soft Skills:
* Strong problem-solving and analytical abilities.
* Good communication skills in English and the ability to work effectively in a team.
* High sense of responsibility, enthusiasm for technical challenges, and ability to work under pressure.
* Capable of working independently with minimal supervision.
Skillset Summary:
* Server Operations & Maintenance
* Linux OS Troubleshooting
* Hardware Installation & Lifecycle Management
* Network Fundamentals (TCP/IP, MAC, Subnet)
* Scripting (Shell/Bash)
* Asset Management & Ticketing Systems
* IPMI and Remote Server Management