Site reliability engineer

Posted: 12 December
Offer description

Description

We are seeking an experienced Site Reliability Engineer with strong Linux troubleshooting skills and deep knowledge of virtual cloud networks and access technologies. The ideal candidate will have proven experience resolving complex issues across large-scale network infrastructure and cloud services in real time.

Responsibilities include diagnosing and resolving production incidents, writing Python and Bash scripts on the fly to support live troubleshooting and automation, and maintaining operational reliability across cloud networking environments. Candidates should have hands-on expertise with remote access technologies such as FastConnect, IPsec, and BGP for secure and scalable route distribution. A strong understanding of Linux system processes, memory utilisation, disk and log management, network functionality, containerisation, and the TCP/IP stack is essential.

The role involves triaging and resolving Severity 1 and 2 incidents using logs, metrics, and CLI tools under pressure, including failed changes or system and process failures that directly impact customers in a 24/7 operational environment.

Responsibilities

  • Work with the Virtual Networking team to share full-stack ownership of a collection of services and technology areas, providing operational support as part of an on-call rotation.
  • Understand the end-to-end configuration, technical dependencies, and overall behavioural characteristics of production services.
  • Take responsibility for the delivery of the mission-critical stack with a strong focus on security, resiliency, scalability, and performance.
  • Hold authority for end-to-end performance and operability.
  • Partner with global development teams to define and implement improvements in service architecture.
  • Clearly articulate the technical characteristics of services and technology areas, guiding development teams to engineer and deliver premier capabilities within the Oracle Cloud service portfolio.
  • Develop and communicate a clear understanding of the scale, capacity, security, and performance attributes and requirements of the service and technology stack.
  • Demonstrate a solid grasp of automation and orchestration principles.
  • Act as the ultimate escalation point for complex or critical issues that have not yet been documented as Standard Operating Procedures (SOPs).
  • Apply a deep understanding of service topologies and their dependencies to troubleshoot issues and define mitigations.
  • Understand and explain the impact of product architecture decisions on distributed systems.
  • Exhibit professional curiosity and a desire to develop a deep technical understanding of services and technologies.
  • Ensure that high-quality, accurate, and timely technical documentation of incidents, problems, changes, and standard operating procedures is maintained using tools such as Jira and Confluence.

Work is non-routine and highly complex, involving the application of advanced technical and business skills within the Virtual Networking specialisation of Oracle Cloud Infrastructure (OCI).

Qualifications

  • Strong understanding of virtual network architecture, security, and automation.
  • Understanding of the TCP/IP stack and routing concepts in Linux systems and networking environments, specifically IPsec, VPNs, and BGP.
  • Experience with containerisation technologies and orchestration platforms.
  • Solid understanding of Virtual Cloud Networks (VCNs) in public cloud environments.
  • Experience with CI/CD systems and release automation tools.
  • Experience in scripting languages such as Python or Shell.
  • Familiarity with infrastructure automation tools such as Terraform and Chef.
  • Leadership experience to ensure that appropriate changes, upgrades, and enhancements are made based on technical analysis.
  • Ability to support network segmentation (e.g., security lists, network security groups, or firewalls).
  • Deep understanding of manipulating telemetry data (traffic flows, health status) using Grafana dashboards and MQL.
  • Experience with major public cloud providers (e.g., Oracle Cloud Infrastructure (OCI) or equivalent).
  • Experience using Jira and Confluence for incident tracking, knowledge management, and ongoing technical documentation.

Career Level - IC3



© 2025 Jobijoba - All Rights Reserved
