NVIDIA
AI
Senior AI-Native Systems Software Engineer, TensorRT
Neural analysis suggests this role is
optimal for Senior candidates.
“Senior AI-Native Systems Software Engineer, TensorRT at NVIDIA. Skills: AI-Native Systems Software Engineering, TensorRT, Generative AI, Agentic development framework, Deep learning, C++, Inference. Architecting an AI-native framework. Scaling through agentic workflows”
What You'll Achieve.
- Make TensorRT the default entry point for out-of-framework inference globally
- Produce high-performance, high-quality, modern C++ software at an unprecedented scale
- Scale out an agentic development framework
- Improve users’ experience with lightning-fast model onboarding
- Build a codebase and architecture that scales beyond human capacity
- Improve the ratio of compute-to-software output
- Keep humans focused on the highest-value work
- Prototype new capabilities and integrate them into the framework
- Ensure a seamless, high-performance path to production for the latest model families
- Achieve major latency and throughput gains for critical customer use cases
What They're Looking For.
Must Have
- 4+ years of relevant software development experience
- Modern C++ skills: Proficiency with C++11/14/17 (or newer) and the STL, with an emphasis on clean, maintainable, performant code
- Deep learning familiarity: Experience with modern inference frameworks and an understanding of the architectural nuances of LLMs, Diffusion, and multi-modal models
- Systems thinking: Interest in how software architecture must evolve to support automated, agent-driven development and indefinitely scaling codebases
- End-to-end product sense: Ability to translate high-level customer needs into concrete technical requirements and user-centric solutions
- Pragmatic execution: Demonstrated ability to go from customer requests to production-quality software on tight timelines
Nice to Have
- Agentic framework experience: Hands-on work with AI agent orchestrators or multi-agent coding frameworks, or experience building custom agentic coding harnesses for production software
- CUDA & kernel expertise: Experience with CUDA programming or exposure to kernel generation / autotuning efforts
- High-velocity prototyping: A track record of rapidly turning state-of-the-art papers into working prototypes in days, not weeks
- Performance profiling skills: Expertise in software performance analysis, profiling, and optimization (CPU and/or GPU), including using tooling to drive measurable wins
What You'll Do.
Architecting an AI-native framework
Scaling through agentic workflows
Rapid prototyping with SOTA models
Delivering a great user experience
Extreme performance optimization
How You'll Work.
Team & Collaboration
Comfort working across internal organizations and with customers
Communication Scope
Excellent communication skills
Full Job Description
Are you passionate about redefining how software is built in the age of Generative AI? Join NVIDIA’s TensorRT team to help lead a first-of-its-kind, AI-native initiative designed to make TensorRT the default entry point for out-of-framework inference globally. We are moving beyond traditional development cycles with a new framework built from the ground up to leverage swarms of AI agents to produce high-performance, high-quality, modern C++ software at an unprecedented scale. If you are a systems-thinking C++ engineer who wants to help scale out an agentic development framework, stay on top of state-of-the-art deep learning breakthroughs, and improve users’ experience with lightning-fast model onboarding, we want to hear from you!

**What you'll be doing:**

* Architecting an AI-native framework: Help design and build a codebase and architecture that scales beyond human capacity, supporting large numbers of AI agents working in parallel to generate, test, and validate production-grade software.
* Scaling through agentic workflows: Improve the ratio of compute-to-software output by adopting and building AI-native tools, multi-agent orchestrators, and codebase harnesses that keep humans focused on the highest-value work.
* Rapid prototyping with SOTA models: Act as a technical scout, identifying industry and academic breakthroughs (e.g., new attention mechanisms, KV cache strategies) and dispatching AI agent swarms to prototype and integrate these capabilities into our framework.
* Delivering a great user experience: Ensure a seamless, high-performance path to production for the latest model families (LLMs, Diffusion, Audio, Vision and multi-modal models).
* Extreme performance optimization: Work at the intersection of Python orchestration and C++ engine-level optimizations to achieve major latency and throughput gains for critical customer use cases.

**What we need to see:**

* BS, MS, or PhD in Computer Science, Computer Engineering, AI, or equivalent experience.
* 4
How to Apply on Workday
- Workday has a multi-step form — save your progress after every section.
- "Apply With LinkedIn" can fail or lose data; manual entry is more reliable.
- Watch for the "Submit for Review" final step — hitting "Save" alone does not submit.
- Job requisition numbers are useful when following up with HR by email.