Vice President, Site Reliability Engineer - Pipeline
We’re seeking a future team member for the role of Vice President, Site Reliability Engineer to join our team. This role is based in Manchester.
This is a pipeline requisition, created in anticipation of potential future needs at BNY. We welcome you to apply! When you apply to this general posting, the BNY Talent Acquisition Team may also review your resume for other open roles within the company.
In this role, you’ll make an impact in the following ways:
* Drive reliability and performance by defining SLOs/SLIs, improving observability, and proactively identifying and addressing system bottlenecks across cloud environments.
* Automate infrastructure and operations using Terraform, Kubernetes, and CI/CD tools to eliminate toil and enable scalable, fault-tolerant deployments.
* Collaborate cross-functionally with product, infrastructure, and DevOps teams to reduce incidents, build resilient services, and ensure architectural clarity.
* Lead incident management by participating in on-call rotations, conducting postmortems, and implementing automated recovery to minimize downtime.
* Build and maintain monitoring systems with tools like Prometheus, Grafana, AppDynamics, and Splunk to support real-time alerting and root cause analysis.
* Develop platform tooling and pipelines for container orchestration, third-party integrations, and cloud-native operations to improve system efficiency and reliability.
* Mentor engineers and champion SRE best practices, embedding a reliability-first culture and ensuring technical excellence across engineering teams.
To be successful in this role, we’re seeking the following:
* Strong expertise in cloud infrastructure (Azure, AWS, or GCP), containerization (Docker, Kubernetes), and Infrastructure as Code (Terraform, Helm).
* Proficiency in observability and monitoring tools such as Prometheus, Grafana, AppDynamics, Datadog, and Splunk, plus experience with incident response and on-call support.
* Solid programming and scripting skills in languages like Python, Go, or Java, with a focus on automation, tooling, and system integration.
* Deep understanding of SRE principles, including SLAs, SLOs, error budgets, postmortems, and reliability-focused system design.
* Strong collaboration and communication skills, with experience working in Agile environments and partnering with cross-functional engineering, product, and operations teams.
BNY is an Equal Employment Opportunity/Affirmative Action Employer - Underrepresented racial and ethnic groups/Females/Individuals with Disabilities/Protected Veterans.