NVIDIA
Technology
Senior LLM Agents Architect
Neural analysis suggests this role is optimal for Senior candidates.
“Senior LLM Agents Architect at NVIDIA. Skills: LLM Agents, GPU Architecture, CUDA Programming, System Design. Design agentic AI systems. Generate GPU compute kernels”
What You'll Achieve.
Drive significant improvements; Target speed-of-light performance; Rival hand-tuned optimization; Confident iterations; Cost control; Safe improvements
Industry & Context.
Analysis; Optimization; Identify bottlenecks; Propose mitigations; What-if analysis; Failure recovery
What They're Looking For.
Must Have
8+ years in applied ML/AI, 2+ years crafting agentic LLM applications, B.Sc. in Computer Science or Electrical Engineering, Solid grounding in computer architecture, Familiarity with NVIDIA GPU architecture, Hands-on CUDA programming experience, Writing, profiling, and optimizing GPU kernels, Comfortable with Nsight Compute and Nsight Systems, Proven ownership of an end-to-end agentic system, Software engineering skills in Python, Proficient in tool use, Proficient in RAG pipelines, Proficient in model adaptation techniques, Demonstrated ability to collaborate with HW/SW experts, Translate heuristics into deterministic tools, Translate heuristics into constraints, Translate heuristics into evaluation metrics, Excellence in communication and facilitation, Aligning diverse collaborators, Documenting decisions/assumptions, Influencing without authority, Track record of building observability for AI systems, Dataset/version management, Offline test suites, Online telemetry, Guardrails/safety checks, Rollback plans
Nice to Have
Familiarity with the PyTorch compilation stack, Familiarity with GPU graph compilers, Familiarity with kernel fusion strategies, Familiarity with auto-tuning frameworks, Background in performance engineering, Experience with performance modeling, Experience with hardware simulators, Familiarity with distributed processing, Familiarity with multi-GPU workloads, Familiarity with networking, Familiarity with frontier agentic coding tools, Understanding of tool orchestration, Understanding of context management, Understanding of autonomous task execution, Hands-on experience building a domain-specific coding agent, Experience with agent frameworks
What You'll Do.
Design agentic AI systems
Generate GPU compute kernels
Analyze GPU compute kernels
Optimize GPU compute kernels
Collaborate with GPU architects
Collaborate with performance engineers
Encode domain expertise into agent workflows
Build automated performance forensics agents
Ingest simulation traces
Ingest Nsight profiler data
Propose architectural mitigations
Propose software mitigations
Partner with HW architects
Develop agentic flows for GPU studies
Enable rapid what-if analysis
Explore agentic approaches to HW/SW co-design
Replace graph-compiler functionality
Augment graph-compiler functionality
Rapidly prototype solutions
Thoughtfully integrate with internal services
Utilize GPU capabilities
Deliver fitting solutions
Set up evaluation backbone
Implement safe improvements
How You'll Work.
Team & Collaboration
Hardware architects; Verification engineers; GPU performance experts; Software developers; HW/SW domain experts; Diverse collaborators
Communication Scope
Communication; Facilitation; Documentation; Playbooks
Process & Methodology
Requirements, Architecture, Implementation, Evaluation, Incremental hardening
Full Job Description
We don't just build the hardware and software that power the AI revolution; we are building the AI that designs the next generation of both. Our team sits at the intersection of inference software and GPU architecture, creating autonomous LLM-driven systems that reason about hardware, write high-performance CUDA, and automate the complex loops of architectural simulation, analysis, and optimization.

We are looking for a Senior LLM Agents Architect to work hands-on with hardware architects, verification engineers, GPU performance experts, and software developers to build end-to-end agent flows that drive significant improvements in kernel optimization, architectural exploration, and developer efficiency.

**What you'll be doing:**

* Design and build agentic AI systems that generate, analyze, and optimize GPU compute kernels, targeting speed-of-light performance on NVIDIA hardware.
* Collaborate with GPU architects and performance engineers to encode domain expertise (memory hierarchy trade-offs, occupancy tuning, instruction-level reasoning) into agent workflows that rival hand-tuned optimization.
* Build automated performance forensics agents capable of ingesting large-scale simulation traces and Nsight profiler data to identify bottlenecks and propose architectural or software mitigations.
* Partner with HW architects to develop agentic flows for GPU architectural studies, enabling rapid what-if analysis across micro-architecture configurations such as cache sizing, memory controller design, and compute unit scaling.
* Explore agentic approaches to HW/SW co-design challenges, including replacing or augmenting graph-compiler functionality (e.g., TorchInductor) with LLM-driven optimization and code-generation pipelines.
* Rapidly prototype and thoughtfully productize; integrate with internal services, utilize GPU capabilities, remove bottlenecks, and deliver fitting solutions.
* Set up evaluation backbone using offline golden sets and online telemetry for confi
How to Apply on Workday
- Workday has a multi-step form — save your progress after every section.
- "Apply With LinkedIn" can fail or lose data; manual entry is more reliable.
- Watch for the "Submit for Review" final step — hitting "Save" alone does not submit.
- Job requisition numbers are useful when following up with HR by email.