As a Principal Site Reliability Engineer at JPMorgan Chase within the Mergers & Acquisitions SRE team, you will work with your fellow stakeholders to define non-functional requirements (NFRs) and availability targets for the services in your application and product lines. You will ensure those NFRs are accounted for in your products’ design and test phases, that your service level indicators are effectively measuring customer experience, and that service level objectives are defined with stakeholders and implemented in production.
We are looking for an experienced infrastructure engineer to join our Mergers & Acquisitions SRE team. Your knowledge of Microsoft 365 and Google Workspace will help integrate email, chat, storage, and collaboration platform services during merger projects. You will also ensure the resilience and scalability of our infrastructure, with a direct impact on our operations. You will collaborate with cross-functional teams, implement change management practices, and analyze data for strategic insights. Your technical and communication skills will be vital to cloud services integration. If you have experience in cloud migrations and a strong technical background, apply to help advance our operational excellence.
Job responsibilities
1. Creates high quality designs, roadmaps, and program charters that are delivered by you or the engineers under your guidance
2. Provides advice and mentoring to other engineers and acts as a key resource for technologists seeking advice on technical and business-related issues
3. Demonstrates site reliability principles and practices every day and champions the adoption of site reliability throughout your team
4. Collaborates with others to create and implement observability and reliability designs for complex systems that are robust, stable, and do not incur additional toil or technical debt
5. Works toward becoming an expert on the applications and platforms in your remit while understanding their interdependencies and limitations
6. Evolves and debugs critical components of applications and platforms
7. Designs and implements cloud-based solutions leveraging Microsoft 365 and Google Workspace offerings to ensure the resiliency and scalability of cloud infrastructure
8. Uses Microsoft tools such as PowerShell, Azure AD, Graph API, Microsoft SQL Server, and SQL Reporting Services to maintain and enhance cloud services
9. Applies change management practices and agile frameworks to implement account-level changes, providing testing, documentation, and guidance to operations for global tenant-wide changes
10. Acquires comprehensive knowledge of the datasets required for a holistic view of the tenant footprint to assist in strategic deployments
Required qualifications, capabilities, and skills
11. Formal training or certification in site reliability concepts, culture, and principles, with advanced hands-on experience
12. Advanced knowledge and experience in observability, including white-box and black-box monitoring, service level objectives, alerting, and telemetry collection using tools such as Grafana, Dynatrace, Prometheus, Datadog, and Splunk
13. Advanced knowledge of software applications and technical processes with considerable depth in one or more technical disciplines
14. Ability to communicate data-based solutions with complex reporting and visualization methods
15. Recognized as an active contributor to the engineering community
16. Continues to expand network and leads evaluation sessions with vendors to see how offerings can fit into the firm’s strategy
17. Ability to anticipate, identify, and troubleshoot defects found during testing
18. Strong communication skills with ability to mentor and educate others on site reliability principles and practices
19. Ability to apply technical expertise in Microsoft OneDrive and SharePoint Online to devise strategies, identify potential obstacles, and set line of business (LOB) expectations for merger projects
20. Comprehensive knowledge of the datasets required for a holistic view of the Microsoft OneDrive and SharePoint Online footprint, including private and public sites, access groups, retention policies, and more
21. Ability to develop, construct, and execute a consistent data gathering process used to assess scope, establish proper data ownership, and perform post-migration validation
22. Proficiency in using Microsoft tools such as PowerShell, Azure AD, Graph API, Microsoft SQL Server, and SQL Reporting Services
23. Capable of assessing, documenting, and mastering migration tools such as Quest On Demand and AvePoint
Preferred qualifications, capabilities, and skills
24. Understanding of change management practices and agile frameworks for implementing account-level changes while providing testing, documentation, and guidance to operations for global tenant-wide changes
25. Excellent interpersonal and communication skills, and the ability to collaborate effectively with diverse teams, including Product, Engineering, Legal, Cybersecurity, Operations, vendors, and members of the acquired company