Core Duties

- Design and develop machine learning models for traditional ML use cases (forecasting, classification, anomaly detection) and GenAI/LLM applications
- Lead experimentation cycles: define hypotheses, design experiments, evaluate results, and iterate rapidly while adhering to governance requirements
- Transition validated experiments into production-ready solutions, working closely with other engineers on deployment and monitoring
- Build and optimise ML pipelines using AWS services and experiment tracking tools
- Develop and integrate LLM-powered solutions for tracing, evaluation, and production monitoring
- Implement robust experiment tracking, model versioning, and reproducibility practices with full audit trails
- Design feature engineering approaches and contribute to feature store development
- Support production models through monitoring, performance analysis, and continuous improvement
- Apply responsible AI practices, including model explainability and fairness assessment
- Present experiment findings and production outcomes to stakeholders, articulating operational and strategic value
- Mentor junior colleagues and share learnings across the team

About You

You will have experience in many of the following:

- Hands-on experience developing and deploying ML models in Python using frameworks such as scikit-learn, XGBoost, PyTorch, or TensorFlow
- Strong experience with AWS ML services (SageMaker, Lambda, S3) in production environments
- Strong experiment design skills: hypothesis formulation, A/B testing methodology, and statistical evaluation
- Proven track record transitioning models from experimentation to production with appropriate governance and quality controls
- Experience with experiment tracking and MLOps tooling (MLflow, Weights & Biases, Data Version Control)
- Experience developing LLM/GenAI applications, including prompt engineering and RAG architectures

It Would Be Great If You Also Had Experience In Some Of These, But If Not We'll Help You With Them

- Experience with advanced LLM techniques: agents, tool use, and agentic workflows
- Experience with vector databases (Pinecone, Weaviate, pgvector) for RAG applications
- Experience with feature stores (Feast, AWS Feature Store)
- Experience with containerisation (Docker) and orchestration (Kubernetes, ECS)
- Familiarity with Infrastructure as Code (Terraform, CloudFormation)
- Experience with data processing frameworks (Spark, Dask) for large-scale workloads
- Understanding of data governance and compliance frameworks