Wise is a global technology company building the best way to move and manage the world’s money.
Job Description
We’re looking for a Data Science Lead to join our Contact Automation team in London.
About the Role
In the Support squad, we aim to build an automated “Wise Assistant” that can answer most customer questions effectively within the chat interface and support our agents in answering more complex ones. We need this system to work seamlessly and at scale across the majority of our contacts.
How You’ll Be Contributing
Evaluation & Experimentation (Core Focus)
* Own the design and evolution of evaluation frameworks for LLM-based systems (both offline and online)
* Build and scale A/B testing to measure real customer impact and guide product decisions
* Define meaningful metrics that connect system performance to customer outcomes and business impact
System Performance & Reliability
* Identify failure modes across the assistant (reasoning, retrieval, tool use, tone, etc.) and drive improvements
* Work closely with engineers to iterate on agentic workflows.
* Continuously improve system behaviour through prompt design, evaluation insights, and data-driven iteration
* Ensure the system performs reliably across multiple languages and geographies
Opportunity Identification
* Analyse conversation data to uncover new opportunities for improving the customer experience.
* Proactively shape what we should build next.
* Bring clarity to ambiguous problem spaces and turn them into actionable changes to the system
Cross-functional Collaboration
* Work daily with Product, Engineering, and Content/Design to shape the assistant’s behaviour
* Influence product decisions through clear communication, data visualisations, and well-structured, impactful proposals.
* Ensure the system is not only accurate, but intuitive and trustworthy from a customer perspective.
How You’ll Work
* You will operate with a high degree of autonomy in a fast-moving environment (often working on 1–2 week iteration cycles) in close collaboration with engineering, content and product.
* Use feedback loops—both human and system-generated—to continuously improve performance
* Take ownership beyond your immediate scope when needed to move the project forward
Qualifications
* Strong sense of ownership; you drive work from idea to production and are motivated by building real systems used by customers.
* Good communicator; you are able to present ideas and foster discussion with both technical and non-technical audiences.
* Strong grounding in statistics, with the ability to design, run and own A/B testing and experimentation end-to-end.
* Strong understanding of how to evaluate models (classification, regression, or LLMs) and why metrics matter, with a proven ability to connect model/system performance to business outcomes.
* A strong product mindset and ability to work collaboratively in a cross-functional environment.
* Experience writing production code in at least one of Python, TypeScript or Java.
* Familiarity with modern LLM/agentic architectures (e.g. Mastra, LangGraph or similar frameworks)
* Strong data skills (SQL, Snowflake, data visualisation).
Skills
Some extra skills that are great to have (but not essential):
* Experience with a customer-facing agentic system.
* Experience with Bayesian inference, especially with A/B testing.
Additional Information
We are committed to diversity and inclusion. If you’re passionate about learning and want to join our mission, we’d like to hear from you.