NVIDIA
AI
Senior System Software Engineer - Dynamo-Triton Inference Server
“Senior System Software Engineer - Dynamo-Triton Inference Server at NVIDIA. Skills: Rust, C++, deep learning software, high-scale distributed systems, ML systems, Triton Inference Server, NVIDIA Dynamo. Develop world-class GPU-accelerated AI inference serving software. Contribute to feature development”
What You'll Achieve.
Drive broad customer adoption; establish a unified, high-performance inference platform; ensure feature parity and effectively serve both Large Language Model (LLM) and non-LLM workloads; build robust software designed to be deployed in production server or cloud environments; optimize and balance prediction throughput and latency; develop and adopt the next generation of inference technologies
What They're Looking For.
Must Have
MS or PhD in Computer Science or relevant field (or equivalent experience)
5+ years of professional experience working on deep learning software
Excellent Rust & C++ skills
Programming & software design skills, including debugging, performance analysis, and test design
Experience with high-scale distributed systems and ML systems
Nice to Have
Prior experience with AI frameworks and engines, such as TensorRT, PyTorch, ONNX, OpenVINO, vLLM, or TRT-LLM
Knowledge of GPU memory management, cache management, or high-performance networking
Experience with distributed systems programming
Experience contributing to a large open source project: use of GitHub, bug tracking, branching and merging code, OSS licensing issues, handling patches, etc.
What You'll Do.
Develop world-class GPU-accelerated AI inference serving software
Contribute to feature development
Drive broad customer adoption
Drive the convergence of the Triton Inference Server and NVIDIA Dynamo stacks to establish a unified, high-performance inference platform
Ensure feature parity and effectively serve both Large Language Model (LLM) and non-LLM workloads
Be an active member of the open source deep learning software engineering community
Balance a variety of objectives such as building robust software designed to be deployed in production server or cloud environments, optimizing and balancing prediction throughput and latency, and developing and adopting the next generation of inference technologies
How You'll Work.
Team & Collaboration
Fast-paced, agile team environment
Communication Scope
Strong communication skills
Full Job Description
We are looking for a Senior System Software Engineer to work on [Dynamo-Triton Inference Server](https://developer.nvidia.com/dynamo-triton). NVIDIA is hiring software engineers for its GPU-accelerated deep learning software team. Academic and commercial groups around the world are using GPUs to power a revolution in AI, enabling breakthroughs in problems from image classification to speech recognition to natural language processing. We are a fast-paced team building a highly-performant AI inference platform to make design and deployment of new AI models easier and accessible to all users.

**What you'll be doing:**

* Develop world-class GPU-accelerated AI inference serving software.
* Contribute to feature development and drive broad customer adoption.
* Drive the convergence of the Triton Inference Server and NVIDIA Dynamo stacks to establish a unified, high-performance inference platform. This platform will ensure feature parity and effectively serve both Large Language Model (LLM) and non-LLM workloads.
* Be an active member of the [open source deep learning software engineering community](https://github.com/triton-inference-server/server).
* Balance a variety of objectives such as building robust software designed to be deployed in production server or cloud environments, optimizing and balancing prediction throughput and latency, and developing and adopting the next generation of inference technologies.

**What we need to see:**

* MS or PhD in Computer Science or relevant field (or equivalent experience).
* 5+ years of professional experience working on deep learning software.
* Excellent Rust & C++ skills, familiarity with Python, and strong programming & software design skills including debugging, performance analysis, and test design.
* Experience with high-scale distributed systems and ML systems.
* Strong communication skills and ability to work in a fast-paced, agile team environment.

**Ways to stand out from the crowd:**

* Prior experience with AI framew
How to Apply on Workday
- Workday has a multi-step form — save your progress after every section.
- "Apply With LinkedIn" can fail or lose data; manual entry is more reliable.
- Watch for the "Submit for Review" final step — hitting "Save" alone does not submit.
- Job requisition numbers are useful when following up with HR by email.