We are seeking an experienced engineer to work on distributed AI/ML systems. This role involves working on collective operations, the fundamental operations that enable AI to scale across multiple accelerators and servers. Most of our stack is C/C++ and relatively low level, so solid knowledge of Linux, kernels, and performant code is important. Experience with embedded systems, high-speed networking, or HPC interconnects is highly valued.
If you enjoy solving hard problems, want to work with HPC and ML customers, iterate quickly, and deliver impactful solutions at scale, then join us! This role places you at the forefront of AI/ML, working on features for the largest clusters, with major customers, for large AI models.
About the organization
You would join Annapurna Labs, an integral part of AWS, which develops hardware and software components critical for EC2 infrastructure. Every EC2 instance runs hardware designed by Annapurna Labs. We focus on designing software, systems, and chips that optimize the AWS customer experience.
A day in the life
Our team at Annapurna Labs builds networking solutions for ML and HPC workloads on AWS. You will collaborate with infrastructure experts, hardware engineers, RTL engineers, scientists, and architects. The team is global and diverse, and it fosters an environment of knowledge sharing. We value mentorship, code reviews, and career development, and we encourage continuous learning in the rapidly evolving AI/ML field.
Team culture and growth
We support new team members with mentorship and tailored projects that help them develop their skills. We value diverse experiences and encourage candidates from non-traditional backgrounds to apply. AWS is the world's most comprehensive cloud platform, committed to innovation and customer trust. We promote an inclusive culture through employee-led affinity groups and learning events. We prioritize work-life balance, offering flexible hours and supporting your personal well-being. Our focus on mentorship and growth aims to help you become a well-rounded professional.
Minimum qualifications
1. 5+ years of professional software development experience (non-internship)
2. 5+ years of programming experience in at least one software language
3. 5+ years of experience in designing or architecting systems (design patterns, reliability, scalability)
4. 5+ years of experience across the full software development lifecycle (coding standards, code reviews, source control, build, testing, operations)
5. Experience as a mentor, tech lead, or leading an engineering team
Educational requirements
Bachelor's degree in computer science or equivalent.
Additional information
Amazon is an equal opportunity employer. For Los Angeles County applicants, the role includes specific safety, communication, and legal requirements. We consider qualified applicants with arrest and conviction records. If you need workplace accommodations, visit https://amazon.jobs/content/en/how-we-hire/accommodations. Our compensation varies by geographic market, with a range from $151,300 to $261,500 annually, plus potential equity, sign-on bonuses, and benefits. This position remains open until filled. Apply via our career site.