Job Overview
As a Principal Site Reliability Engineer at JPMorgan Chase within the Mergers & Acquisitions SRE team, you will work with stakeholders to define non-functional requirements (NFRs) and availability targets for services in your application and product lines. You will ensure these NFRs are incorporated into product design and testing, monitor customer experience through service level indicators, and establish service level objectives with stakeholders for production implementation.
Role Summary
We seek an experienced infrastructure engineer to join our Mergers & Acquisitions SRE team. Your expertise in Microsoft 365 and Google Workspace will facilitate the integration of email, chat, storage, and collaboration platforms during merger projects. You will ensure infrastructure resilience and scalability, collaborate with cross-functional teams, implement change management practices, and analyze data for strategic insights. Strong technical and communication skills are vital to cloud services integration, and experience with cloud migrations is especially valuable.
Job Responsibilities
1. Create high-quality designs, roadmaps, and program charters, either personally or through guiding engineers.
2. Provide advice and mentorship to engineers; serve as a key resource for technical and business-related issues.
3. Demonstrate and promote site reliability principles daily within your team.
4. Collaborate to develop and implement observability and reliability designs for complex, robust, and stable systems.
5. Become an expert on applications and platforms within your remit, understanding their interdependencies and limitations.
6. Evolve and debug critical components of applications and platforms.
7. Design and implement cloud-based solutions leveraging Microsoft 365 and Google Workspace to ensure resiliency and scalability.
8. Utilize Microsoft tools such as PowerShell, Azure AD, Graph API, SQL Server, and Reporting Services for cloud service maintenance and enhancement.
9. Apply change management and agile practices for global tenant-wide changes, including testing, documentation, and guidance.
10. Acquire comprehensive knowledge of datasets for strategic deployments and tenant footprint analysis.
Required Qualifications and Skills
1. Formal training or certification in site reliability principles, plus advanced hands-on experience applying them.
2. Advanced knowledge of observability tools such as Grafana, Dynatrace, Prometheus, Datadog, and Splunk.
3. Deep understanding of software applications and technical processes in relevant disciplines.
4. Ability to communicate complex data solutions effectively.
5. Active contribution to the engineering community.
6. Experience evaluating vendor offerings and integrating them into strategy.
7. Ability to troubleshoot defects during testing phases.
8. Strong communication skills for mentoring and education on reliability principles.
9. Expertise in Microsoft OneDrive and SharePoint Online, including datasets, retention policies, and access controls.
10. Ability to develop and execute data-gathering and post-migration validation processes.
11. Proficiency in PowerShell, Azure AD, Graph API, SQL Server, and Reporting Services.
12. Experience assessing and mastering migration tools such as Quest On Demand and AvePoint.
Preferred Qualifications and Skills
1. Understanding of change management and agile frameworks for global tenant-wide changes.
2. Excellent interpersonal and communication skills for effective collaboration across diverse teams including Product, Engineering, Legal, Cybersecurity, Operations, Vendors, and Acquired Companies.