Quality Engineer - AI Platform Engineering
Employment Type: Inside IR35 Contract
Duration: 6 Months
Rate: £650 - £750 per Day
Location: Central London, Hybrid (3 Days Onsite per Week)
Opportunity Overview
A global organisation is building a centralised AI enablement and platform engineering function focused on delivering scalable, secure, and governed AI capabilities across the enterprise.
This role sits within a programme delivering enterprise-grade agentic AI infrastructure, including internal AI assistants, retrieval and search services, extensibility frameworks, and governance tooling.
Programme Overview
The programme is focused on delivering a production-grade internal agentic AI platform, including:
Development of an enterprise AI assistant capable of reasoning, planning, and tool orchestration
Operation of enterprise retrieval, search, and grounding services for approved data sources
Delivery of a secure internal gateway layer providing discovery, observability, policy enforcement, and lifecycle management for AI-integrated services
Design and development of AI-integrated services and reusable capabilities that safely expose internal and third-party systems to AI agents
Establishment of evaluation, governance, and quality-control frameworks to support scalable and compliant deployment of AI capabilities
The programme currently follows a centrally delivered model while evolving towards a federated contribution approach over time.
Key Responsibilities
Define and implement evaluation frameworks covering correctness, safety, reliability, and regression impact for AI-integrated services
Develop and maintain automated test pipelines for agentic workflows, including tool orchestration and multi-step execution paths
Identify, evaluate, and mitigate AI system failure modes such as hallucinations, invalid inputs, latency issues, and inappropriate tool usage
Produce testing and governance evidence required for internal approval and operational processes
Collaborate closely with ML Engineers and platform teams to embed testability and evaluation capabilities into AI services
Contribute to the long-term quality assurance and governance strategy for enterprise-wide AI platform adoption
Essential Skills and Experience
Strong Python development experience, particularly for automation and test frameworks
Experience with LLM and RAG evaluation tooling, frameworks, or custom evaluation pipelines
Expertise in automated testing at the unit, integration, and regression levels
Good understanding of agentic AI systems, associated risks, and operational failure modes
Ability to assess technical solutions against governance, audit, and security requirements
Experience working within regulated or highly governed engineering environments
What's on Offer
Opportunity to contribute to large-scale enterprise AI transformation initiatives
Exposure to cutting-edge AI platform engineering and governance challenges
Collaborative environment working alongside platform engineers, ML specialists, and architecture teams
Influence over the development of long-term AI quality and governance standards
Opportunity to shape scalable AI engineering practices within a complex enterprise environment