Scindo is building the next generation of enzyme-powered chemistry, combining wet-lab data with state-of-the-art machine learning. We are looking for a Machine Learning Scientist to design and deploy models that push the boundaries of enzyme prediction, reaction modelling, and generative catalyst design.
Essential requirements
* PhD (or equivalent) in Physics, Applied Mathematics, Computational Chemistry, or a related field.
* Proven experience applying machine learning to molecular systems, e.g. protein engineering, enzyme catalysis, reaction prediction, molecular de novo design, molecular dynamics.
* Strong background working with deep learning architectures relevant to molecules/sequences:
- Transformers (e.g. ProtBERT, ESM, AlphaFold-like)
- Equivariant neural networks / GNNs (SchNet, DimeNet, SE(3)-Transformers)
- Generative models (diffusion, VAEs, autoregressive) for proteins, molecules or materials
* Hands-on experience with molecular dynamics and simulation data; familiarity with force fields, ab initio methods, or enhanced sampling.
* Excellent mathematical foundation: linear algebra, optimisation, probability, statistical mechanics, PDE/ODE modelling.
* Strong programming skills in Python (PyTorch/TensorFlow, JAX, NumPy/SciPy); experience with scientific libraries such as RDKit, ASE, DeepChem.
Desirable skills
* Experience with MLOps and end-to-end large-scale model development (e.g. training, evaluation, benchmarking and deployment).
* Familiarity with vector databases and embeddings (Qdrant, Milvus, FAISS) for chemical/sequence similarity search.
* HPC/GPU cluster experience, performance optimisation, distributed training.
* Background in spectroscopy (IR, UV/Vis, NMR) and/or computational thermodynamics/kinetics.
* Exposure to enzyme engineering, biocatalysis, or structural biology data.
What we offer
* Opportunity to build a machine learning stack from the ground up, with direct impact on real-world sustainable chemistry.
* A highly collaborative lab–computational environment: every model prediction is tested in-house, feeding back into data pipelines.
* Central London lab/office with a fast-growing interdisciplinary team.