While technology is the heart of our business, a global and diverse culture is the heart of our success. We love our people, and we take pride in providing them with a culture built on transparency, diversity, integrity, learning, and growth. If working in an environment that encourages you to innovate and excel, not just in your professional life but in your personal life as well, interests you, you will enjoy your career with Quantiphi!

About Quantiphi:
Quantiphi is an award-winning Applied AI and Big Data software and services company, driven by a deep desire to solve transformational problems at the heart of businesses. Our signature approach combines groundbreaking machine-learning research with disciplined cloud and data-engineering practices to create breakthrough impact at unprecedented speed. Quantiphi has seen 2.5x year-on-year growth since its inception in 2013; we don't just innovate, we lead. Headquartered in Boston, with 4,000 professionals across the globe, Quantiphi leverages Applied AI technologies across multiple industry verticals (Telco, BFSI, HCLS, etc.) and is an established Elite/Premier Partner of NVIDIA, Google Cloud, AWS, Snowflake, and others.

We've been recognized with:
- 17x Google Cloud Partner of the Year awards in the last 8 years
- 3x AWS AI/ML award wins
- 3x NVIDIA Partner of the Year titles
- 2x Snowflake Partner of the Year awards
We have also garnered top analyst recognitions from Gartner, ISG, and Everest Group, and we have been certified as a Great Place to Work for the third year in a row (2021, 2022, 2023).

Be part of a trailblazing team that's shaping the future of AI, ML, and cloud innovation. Your next big opportunity starts here! For more details, visit: Website or LinkedIn Page.

Role: Platform Architect
Experience Level: 7 years
Employment Type: Full-Time or Contract with immediate joining
Work Location: London, UK or Remote Canada/US (willing to travel to the UK)

Role Summary:
We are seeking a Platform Architect with deep expertise in GPU-based infrastructure to design, optimize, and scale next-generation platforms for Generative AI (GenAI) workloads. In this role, you will lead the architecture and performance optimization of large-scale AI systems, leveraging technologies such as Slurm, Red Hat OpenShift, and the NVIDIA GPU ecosystem. You will play a pivotal role in shaping the foundation of Quantiphi's GenAI platform strategy while enabling customer-facing production deployments that drive real-world impact.

Key Responsibilities:
- Architect scalable GenAI infrastructure to support LLM training, fine-tuning, and inference across multi-GPU systems.
- Perform GPU profiling and optimization, including benchmarking, workload parallelization, and distributed training performance tuning.
- Manage compute workloads on Slurm-based clusters and containerized environments such as Red Hat OpenShift or Kubernetes (an illustrative sketch follows this list).
- Optimize the NVIDIA GPU software stack (CUDA, cuDNN, NCCL, Triton Inference Server, RAPIDS, TensorRT, etc.) for high-performance GenAI and deep learning workloads.
- Collaborate cross-functionally with data scientists, MLOps, and application engineering teams to deliver and operationalize large-scale AI models.
- Design secure, production-ready GenAI pipelines supporting fine-tuning, Retrieval-Augmented Generation (RAG), multi-modal inference, and LLMOps.
- Develop reusable infrastructure templates and automation (Terraform, Helm charts, Ansible) for consistent, GPU-ready environment provisioning.
- Lead internal enablement initiatives, including proofs of concept, workshops, and capability development, and translate them into client delivery success.
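For illustration only, and not part of the formal role requirements: a minimal sketch, assuming a PyTorch/NCCL training stack, of how a training task launched under Slurm might bootstrap torch.distributed from the environment variables Slurm exports for each rank. The hostname, port, and model are placeholders, not Quantiphi specifics.

    # Minimal sketch: initialize distributed training from Slurm-provided env vars.
    import os
    import torch
    import torch.distributed as dist

    def init_from_slurm():
        # Slurm sets SLURM_PROCID (global rank), SLURM_NTASKS (world size),
        # and SLURM_LOCALID (rank within the node) for every launched task.
        rank = int(os.environ["SLURM_PROCID"])
        world_size = int(os.environ["SLURM_NTASKS"])
        local_rank = int(os.environ["SLURM_LOCALID"])

        # MASTER_ADDR / MASTER_PORT are assumed to be exported by the sbatch
        # script; the values below are placeholders for this sketch.
        os.environ.setdefault("MASTER_ADDR", "node-0")
        os.environ.setdefault("MASTER_PORT", "29500")

        # Bind this process to its GPU, then join the NCCL process group.
        torch.cuda.set_device(local_rank)
        dist.init_process_group(backend="nccl", rank=rank, world_size=world_size)
        return rank, local_rank, world_size

    if __name__ == "__main__":
        rank, local_rank, world_size = init_from_slurm()
        # A real job would wrap its model in DistributedDataParallel here, e.g.
        # model = torch.nn.parallel.DistributedDataParallel(model, device_ids=[local_rank])
        dist.barrier()
        if rank == 0:
            print(f"Initialized {world_size} ranks over NCCL")
        dist.destroy_process_group()

In practice a pattern like this is typically wrapped in an sbatch script that requests GPUs (for example via --gres), exports MASTER_ADDR and MASTER_PORT, and launches one task per GPU.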
Basic Qualifications (BQs):
- 8 years of experience as a Platform Engineer, Infrastructure Architect, or related role designing and managing large-scale compute environments.
- Proven expertise managing Slurm job scheduling and distributed training in large-scale GPU clusters.
- Strong hands-on experience with Red Hat OpenShift and/or Kubernetes orchestration.
- Deep understanding of the NVIDIA GPU ecosystem, including CUDA, cuDNN, NCCL, Nsight Systems, Triton, and TensorRT.
- Advanced Linux systems proficiency, including performance tuning and resource optimization in high-performance computing (HPC) environments.
- Experience deploying and optimizing GenAI workloads such as LLM fine-tuning, RAG pipelines, and multi-modal inference systems.
- Proficiency with Infrastructure-as-Code (IaC) tools like Terraform and Ansible.
- Familiarity with cloud GPU environments (GCP, AWS, Azure, OCI) as well as on-premise GPU infrastructure.

Other Qualifications (OQs):
- Experience with NVIDIA NIMs, DGX systems, and GPU-accelerated container workflows.
- Knowledge of LLMOps and MLOps frameworks tailored for GenAI pipelines.
- Familiarity with vector databases and retrieval systems used in RAG architectures.
- Strong collaboration and communication skills, with experience in client-facing technical solutioning and pre-sales support.

What's in it for YOU at Quantiphi:
- Make an impact at one of the world's fastest-growing AI-first digital engineering companies.
- Upskill and discover your potential as you solve complex challenges in cutting-edge areas of technology alongside passionate, talented colleagues.
- Work where innovation happens: work with disruptive innovators in a research-focused organization with 60 patents filed across various disciplines.
- Stay ahead of the curve: immerse yourself in breakthrough AI, ML, data, and cloud technologies and gain exposure working with Fortune 500 companies.

If you like wild growth and working with happy, enthusiastic over-achievers, you'll enjoy your career with us!