Annapurna Labs Ltd.
Technology
ML Software Engineer, Data Plane
“ML Software Engineer, Data Plane at Annapurna Labs Ltd. Skills: ML inference, custom hardware, LLM optimization. Develop and optimize compute kernels.”
Industry & Context.
Identify bottlenecks; Drive performance improvements
What They're Looking For.
Must Have
Bachelor's degree or equivalent, 3+ years full software development life cycle, Coding standards experience, Code reviews experience, Source control management experience, Build processes experience, Testing experience, Operations experience, Knowledge of computer architecture, Knowledge of operating systems, Knowledge of parallel computing
Nice to Have
Knowledge of Machine Learning fundamentals, Knowledge of LLM fundamentals, Knowledge of transformer architecture, Knowledge of training/inference lifecycles, Knowledge of optimization techniques, Knowledge of ML frameworks, Experience deploying LLMs in production, Experience on GPUs, Experience on Neuron, Experience on TPU, Experience on other AI acceleration hardware
What You'll Do.
Develop compute kernels
Optimize compute kernels
Target production-level performance
Implement LLM architectures
Validate LLM architectures
Integrate accelerator backends
Implement model parallelism
Build test infrastructure
Maintain test infrastructure
Profile inference workloads
Optimize inference workloads
Instrument critical paths
Drive latency improvements
Drive throughput improvements
Own features end-to-end
Contribute to CI/CD pipelines
Full Job Description
The MLIL DataPlane team is looking for a Software Development Engineer to own the design and implementation of our inference data plane. We build the software that makes large models run efficiently on custom hardware - spanning model execution, memory management, data movement, and serving integration.

Our work covers the full inference path: integrating serving engines with custom hardware, developing high-performance compute kernels, enabling efficient data movement, and driving models from early validation through production. We operate at frontier scale with large distributed models. This is a ground-up effort with rapidly evolving hardware and software.

We are looking for an IC who can write and optimize low-level code for custom hardware, validate model architectures end-to-end, build test and profiling infrastructure, and drive performance across the stack.

Key job responsibilities
- Develop and optimize compute kernels for a custom ML accelerator architecture, targeting production-level performance for large language model inference.
- Implement and validate LLM architectures (decoder-only, mixture-of-experts) end-to-end - from PyTorch model definition through distributed execution on custom hardware.
- Integrate custom accelerator backends into open-source ML serving frameworks (vLLM, PyTorch), including scheduler extensions, memory management, and model parallelism.
- Build and maintain test infrastructure for model correctness validation across CPU, GPU, simulator, and hardware targets.
- Profile and optimize inference workloads - identify bottlenecks, instrument critical paths, and drive latency and throughput improvements from simulation through hardware bringup.
- Own features end-to-end: from design through implementation, testing, and integration into the broader software stack.
- Contribute to CI/CD pipelines that gate model and kernel changes on correctness and performance regressions.

Basic Qualifications:
- Bachelor's degree or equivalent
- 3+ y