Description

You have discovered the perfect setting to expand your skills and make a meaningful impact. Partner with an organization committed to defining the future of site reliability in the financial sector.

As a Director of Site Reliability Engineering at JPMorgan Chase within the Chief Technology Office Global Technology Asset Management (CTO-GTAM) team, you constantly establish new collaborative partnerships that allow your team to work across functions. You proactively engage team members, initiate career conversations, and delegate assignments and opportunities equitably.

Job responsibilities

- Collaborates with engineering, support, and operations teams to maintain and improve the reliability of mission-critical applications.
- Participates in incident management, troubleshooting, and continuous improvement initiatives.
- Implements automation and monitoring solutions to enhance system reliability.
- Joins an on-call rotation and responds effectively to production incidents.
- Shares knowledge and follows best practices to foster a culture of learning and innovation.
- Communicates clearly with stakeholders and proactively solves problems.
- Focuses on customer needs and delivers high-quality support.
- Documents solutions and incident responses for future reference.
- Analyzes system performance and recommends improvements.
- Contributes to post-incident reviews and drives process enhancements.
- Supports the integration of new tools and technologies to improve operational efficiency.

Required qualifications, capabilities, and skills

- Formal training or certification in SRE and application support concepts, plus expert applied experience.
- Demonstrable experience in SRE, DevOps, or application support roles, including knowledge of SLIs, SLOs, incident response, and troubleshooting.
- Experience utilizing monitoring and observability tools such as Grafana, Prometheus, Splunk, and OpenTelemetry.
- Hands-on experience with CI/CD pipelines (Jenkins, including global libraries), infrastructure as code (Terraform), version control (Git), containerization (Docker), and orchestration (Kubernetes).
- Experience with cloud platforms such as AWS, GCP, or Azure, and with automating infrastructure and deployments.
- Able to break down complex issues, document solutions, and communicate effectively with team members and customers.
- Experience implementing automation and monitoring solutions to support operational goals.
- Experience collaborating with cross-functional teams to resolve incidents and improve reliability.
- Experience contributing to the continuous improvement of support processes and system performance.

Preferred qualifications, capabilities, and skills

- Deep experience in building enterprise software and proficiency in multiple languages, preferably Java, Python, and shell scripting.
- Demonstrates experience in banking, fintech, or regulated environments.
- Participates in resilience engineering activities such as game days or chaos engineering.
- Mentors peers by sharing knowledge and best practices.
- Contributes to the adoption of innovative tools and approaches in support operations.
- Experience hiring, developing, and recognizing talent.
- Draws upon leadership experience to engage team members and express complex ideas with an appropriate level of detail.