Annapurna Labs (U.S.) Inc.

Technology

Software Development Engineer - AI/ML, AWS Neuron, Multimodal Inference

$129–224k · Cupertino, California, United States · Full time

The Brief

“Software Development Engineer - AI/ML, AWS Neuron, Multimodal Inference at Annapurna Labs (U.S.) Inc. Skills: AI/ML, AWS Neuron, inference acceleration, high-performance computing. Build the AWS Neuron SDK. Accelerate deep learning workloads.”

Industry & Context.

Technology

Problems you'll solve

Troubleshooting performance issues

What They're Looking For.

Must Have

Experience optimizing inference performance, Experience with PyTorch, Experience with JAX, Software development using Python, System level programming, ML knowledge, Low-level optimization expertise, System architecture expertise, ML model acceleration expertise

Nice to Have

Familiarity with PyTorch, Familiarity with JIT compilation, Familiarity with AOT tracing, Familiarity with CUDA kernels, Performant kernel development, Familiarity with Triton syntax, Experience with online/offline inference serving, Deep understanding of computer architecture, Deep understanding of operating-system-level software, Working knowledge of parallel computing

What You'll Do.

Accelerate deep learning workloads

Accelerate GenAI workloads

Integrate ML frameworks

Enable ML inference performance

Enable ML training performance

Run a wide range of models

Support novel architecture

Build systematic infrastructure

Create high-performance kernels

Fine-tune compute units

Push boundaries of AI acceleration

Optimize current performance

Contribute to future architecture designs

Enable customer models

Ensure optimal performance

Optimize customers' ML models

Provide direct support

Provide optimization expertise

Collaborate with open source ecosystems

Provide seamless integration

Bring peak performance

Tune LLM model families

Tune large language models

Create distributed inference solutions

Build distributed inference solutions

Tune distributed inference solutions

Lead efforts in building distributed inference support

Tune models for highest performance

Design machine learning models

Develop machine learning models

Optimize machine learning models

Design distributed computing architecture

Implement distributed computing architecture

Build infrastructure for model analysis

Build infrastructure for model onboarding

Design high-performance kernels

Implement high-performance kernels

Leverage Neuron architecture

Leverage Neuron programming models

Analyze system-level performance

Optimize system-level performance

Implement optimizations

Conduct comprehensive testing

Work directly with customers

Collaborate across teams

Develop innovative optimization techniques

Debug performance issues

Optimize memory usage

Shape the inference stack

How You'll Work.

Team & Collaboration

Cross-functional team; Compiler engineers; Runtime engineers; Hardware teams; Applied scientists; System engineers; Product managers; Open Source Community

Process & Methodology

Agile
