Contract Type
6-month contract, outsourced via an agency on an hourly rate
Location
Egham
Hybrid
3 days onsite (minimum) and 2 days working from home
Rate
Very much dependent on level of experience.
Key responsibilities include
* Performance Optimization: Profile and debug performance bottlenecks at the OS, runtime, and model levels.
* Model Deployment: Work across the stack, from model conversion, quantization, and optimization to runtime integration of AI models on-device.
* Toolchain Evaluation: Compare deployment toolchains and runtimes for latency, memory, and accuracy trade-offs.
* Open-Source Contribution: Enhance open-source libraries by adding new features and improving capabilities.
* Experimentation & Analysis: Conduct rigorous experiments and statistical analysis to evaluate algorithms and systems.
* Prototyping: Lead the development of software prototypes and experimental systems with high code quality.
* Collaboration: Work closely with a multidisciplinary team of researchers and engineers to integrate research findings into products.
We do not require a PhD this time, which is unusual for the AI team.
We're looking for someone with
* Technical Expertise: Strong OS fundamentals (memory management, multithreading, user/kernel mode interaction) and expertise in ARM CPU architectures.
* Programming Skills: Expert proficiency in Python and Rust; knowledge of C and C++ is desirable.
* AI Knowledge: Solid understanding of machine learning and deep learning fundamentals, including architectures and evaluation metrics.
* Problem-Solving: Strong analytical skills and the ability to design and conduct rigorous experiments.
* Team Player: Excellent communication and collaboration skills, with a results-oriented attitude.
Desirable Skills
* Experience with the 64-bit ARM (AArch64) architecture and CPU hardware design.
* Knowledge of trusted execution environments (confidential computing).
* Hands-on experience with deep learning model optimization (quantization, pruning, distillation).
* Familiarity with lightweight inference runtimes (ExecuTorch, llama.cpp, Candle).