Job Description
Overview
We are seeking a highly experienced Senior Network SRE with deep expertise across multi-vendor network infrastructure, automation, and reliability engineering. The ideal candidate will possess strong technical leadership, hands-on engineering capabilities, and a passion for building resilient, scalable, and observable network environments.
Key Responsibilities
1. Design, implement, and maintain highly available network solutions across routing, switching, firewalling, and wireless technologies.
2. Apply SRE principles to improve network reliability, scalability, and performance.
3. Develop and maintain automation workflows using Ansible, Salt, and related frameworks to reduce operational toil.
4. Build and operate monitoring, alerting, and observability dashboards using tools such as Grafana and Splunk.
5. Proactively identify network bottlenecks, performance issues, and reliability risks, and implement long-term fixes rather than reactive workarounds.
6. Support incident response, root cause analysis, and post-incident reviews with a focus on continuous improvement.
7. Collaborate with cross-functional engineering, security, and operations teams to ensure network solutions meet business and technical requirements.
8. Contribute to documentation, runbooks, design artifacts, and operational standards.
9. Participate in capacity planning, network modernization initiatives, and automation-first strategies.
Required Skills & Experience
1. 10+ years of hands-on experience in enterprise or service provider network engineering.
2. Expertise in multi-vendor routing, switching, firewalling, and wireless technologies.
3. Deep understanding of network protocols and technologies (BGP, OSPF, EIGRP, STP, VXLAN, VPNs, QoS, MPLS, etc.).
4. Strong experience with infrastructure automation using Ansible and Salt.
5. Proficiency with observability tooling such as Grafana, Splunk, or equivalent.
6. Solid understanding of SRE practices including SLIs, SLOs, error budgets, and proactive reliability engineering.
7. Strong troubleshooting, analytical, and performance optimization skills.
8. Excellent communication and collaboration skills, with the ability to influence and guide technical stakeholders.
Nice to Have
1. Experience with network programmability (Python, API-driven networking, NETCONF/RESTCONF).
2. Exposure to cloud networking (AWS, Azure, GCP).
3. Knowledge of zero-trust, SD-WAN, and network security best practices.
4. Experience creating self-healing or fully automated network workflows.