Job Summary
We are seeking a highly motivated performance engineer to join our workload modelling team and work on the design and implementation of methodologies and tools for workload modelling and simulation, with a primary focus on AI. As a performance engineer, you will characterize workloads and define methodologies for tracing and reducing large‑scale AI models across various types of content, in order to speed up simulation and performance projection. The role also includes architectural studies and software/hardware co‑optimization to define requirements for upcoming processors and accelerators.
We are looking for a senior technical expert with a deep background in workload modelling and CPU/NPU architecture.
Key Responsibilities
Investigate cutting‑edge, high‑performance server CPU/NPU core and SoC architecture design, contributing vital data support for crucial decision‑making processes.
Design and implement tools for architecture exploration and performance analysis.
Develop strategies for software/hardware co‑optimization features and lead the integration of software and hardware components for the next‑generation processor.
Construct a non‑intrusive, highly accurate system for characterizing and modelling complex workloads, ensuring precise workload representation.
Analyse real‑world workloads and extract their distinctive features, delivering essential insights to our in‑house chip development department.
Required
Extensive industry experience in workload modelling and the development of CPU/NPU architecture.
Skilled in performance projection and architectural exploration using SoC simulators.
Proficient in the development of slicing tools.
Experience developing and using performance simulators such as gem5 (O3 model) and Sniper.
Proficient in benchmark analysis and characterization.
Experience in GPGPU performance analysis.
Strong knowledge of the theory and practice of deep learning, computer vision, natural language processing, or computer graphics.
Strong programming skills in languages such as C++ and Python. Experience with frameworks such as TensorFlow and PyTorch.
Strong grasp of binary analysis and software/hardware co‑optimization techniques.
Excellent collaboration and interpersonal skills.
Considered a Plus
Experience developing for QEMU and DynamoRIO (or Intel Pin on x86).
Experience with CUDA or OpenCL programming.
What We Offer
33 days of annual leave (including UK public holidays)
Group Personal Pension
Life insurance
Private medical insurance
Medical expense claim scheme
Employee Assistance Program
Cycle to work scheme
Company sports club and social events
Additional time off for learning and development