This Senior Software Engineer position within AI Operations focuses on designing, building, and maintaining complex systems that enable alert ingestion, correlation, automation, and operational intelligence at scale. The role is hands-on and technical, with opportunities to influence platform architecture and contribute to AI-adjacent projects across operational environments. Technologies include machine learning, data engineering, AI, and generative AI (GenAI) in support of GST.
Responsibilities
* Work as a full-stack software engineer, coding, developing, and testing applications using Golang and Python across service, API, and automation layers.
* Lead the development and implementation of the AI Operations platform, driving AI-enabled software solutions for alert ingestion, correlation, automation, and operational workflows.
* Design, build, and operate software and automated testing frameworks on AWS, including unit tests, integration tests, API tests, and environment-based validation.
* Document and maintain design specifications, configuration, deployment processes, and troubleshooting runbooks.
* Train, mentor, and oversee development team members through code reviews, design walkthroughs, and technical guidance.
* Identify and implement solutions for critical challenges involving software and hardware interfaces, performance bottlenecks, and reliability issues.
* Deliver full-stack software development using continuous improvement techniques and automation, identifying root causes and resolving issues within distributed systems.
* Develop strong cross-functional relationships with leaders, product owners, engineers, programmers, and QA to contribute to planning, delivery, and operational support.
Requirements
* Hands-on experience developing infrastructure-as-code (IaC) solutions in a data or platform engineering role, and working with Application Performance Management (APM) tools for monitoring and diagnostics.
* Proficiency in programming languages such as Golang and Python.
* Proven expertise in building and scaling automated testing ecosystems on AWS, including unit, integration, and API testing.
* Hands-on experience with algorithms and artificial intelligence applied within software platforms or automation workflows.
* Strong knowledge of REST APIs, API design, versioning, and integration, with experience using low-code/no-code automation platforms.
* Experience working with MongoDB and Postgres, including data modelling, query optimization, and operational support.
* Proven experience creating and maintaining technical documentation, including design specifications, configuration guides, deployment processes, and troubleshooting documentation.
* Experience maintaining architecture artifacts such as sequence diagrams and software specification diagrams to support system understanding and change management.
* Ability to lead and introduce innovative design concepts, influencing development strategies, patterns, and engineering techniques.
* Proven ability to provide technical solutions, troubleshoot engineering automation issues, and stay current with emerging software technologies to design features that complement the platform.
Context
You'll join a globally distributed engineering team working across Sky and NBCUniversal platforms, supporting services used by millions of customers worldwide. The role sits at the intersection of software engineering, platform reliability, and AI-assisted operations. We build systems that help engineering and operations teams detect issues earlier, respond faster, and continuously improve platform resilience and operational outcomes.
Sky are developing a cutting-edge AI Operations platform that supports large-scale streaming, content delivery, and live services across the Sky and NBCUniversal ecosystems. The role focuses on building the complex systems behind that platform and advancing it toward AI-assisted operations.