NVIDIA

Artificial Intelligence

Senior Deep Learning Performance Architect

$184–357k · Santa Clara, California, United States · FULL TIME · Remote Friendly
Market Sentiment
HIGH DEMAND

Neural analysis suggests this role is optimal for Senior candidates.

The Brief

“Senior Deep Learning Performance Architect at NVIDIA. Skills: Deep Learning Performance Architecture, GPU/ASIC architecture, performance modeling, system performance engineering, LLM inference/training optimization. Design and evaluate hardware architectures to improve performance, efficiency, and scalability of production AI workloads. Analyze and optimize large-scale deep learning workloads, especially LLM inference/training in real-world deployments.”

Industry & Context.

Artificial Intelligence
Problems you'll solve

Identify and resolve system bottlenecks across compute, memory, and interconnect

What They're Looking For.

Must Have

  • MS or PhD in a relevant field (Computer Science, Electrical Engineering, Computer Engineering, etc.) or equivalent experience
  • 5+ years of hands-on experience in GPU/ASIC architecture, parallel computing, or system performance engineering
  • Experience with deep learning workloads in production environments (training and/or inference)
  • Proficiency in Python and C++ for building performance models, simulators, or analysis tools
  • Solid understanding of system architecture: memory hierarchy, data movement, and scalability
  • Prior experience debugging, profiling, and performance tuning on real systems
  • Ability to work across teams and drive decisions in fast-paced product environments

Nice to Have

  • Experience translating workload behavior into concrete hardware or system-level improvements
  • Practical experience with LLM inference optimization: batching, disaggregation, KV-cache management, latency/throughput tuning
  • Familiarity with production inference systems (e.g., scheduling, multi-node scaling, resource utilization)

What You'll Do.

Design and evaluate hardware architectures to improve performance, efficiency, and scalability of production AI workloads

Analyze and optimize large-scale deep learning workloads, especially LLM inference/training in real-world deployments

Build and use performance and power models (Python/C++) to drive architecture and product decisions

Identify and resolve system bottlenecks across compute, memory, and interconnect

Evaluate PPA trade-offs and guide feature prioritization for next-generation GPU/ASIC designs

How You'll Work.

Team & Collaboration

Partner closely with software, systems, and product teams to align hardware capabilities with workload requirements. Work across teams and drive decisions in fast-paced product environments.

Full Job Description

We are now looking for a Senior Deep Learning Performance Architect! NVIDIA is seeking outstanding Performance Architects to help analyze and develop the next generation of architectures that accelerate AI and high-performance computing applications. Intelligent machines powered by Artificial Intelligence computers that can learn, reason and interact with people are no longer science fiction. GPU Deep Learning has provided the foundation for machines to learn, perceive, reason and solve problems. NVIDIA's GPUs run AI algorithms, simulating human intelligence, and act as the brains of computers, robots and self-driving cars that can perceive and understand the world.

Come, join our Deep Learning Architecture team, where you can help build real-time, cost-effective computing platforms driving our success in this exciting and rapidly growing field!

**What you’ll be doing**

* Design and evaluate hardware architectures to improve performance, efficiency, and scalability of production AI workloads.
* Analyze and optimize large-scale deep learning workloads, especially LLM inference/training in real-world deployments.
* Build and use performance and power models (Python/C++) to drive architecture and product decisions.
* Identify and resolve system bottlenecks across compute, memory, and interconnect.
* Evaluate PPA trade-offs and guide feature prioritization for next-generation GPU/ASIC designs.
* Partner closely with software, systems, and product teams to align hardware capabilities with workload requirements.

**What we need to see:**

* MS or PhD in a relevant field (Computer Science, Electrical Engineering, Computer Engineering, etc) or equivalent experience.
* 5+ years of hands-on experience in GPU/ASIC architecture, parallel computing, or system performance engineering.
* Experience with deep learning workloads in production environments (training and/or inference).
* Proficiency in Python and C++ for building performance models, simulators, or analysis tools.
* Soli

How to Apply on Workday

  • Workday has a multi-step form — save your progress after every section.
  • "Apply With LinkedIn" can fail or lose data; manual entry is more reliable.
  • Watch for the "Submit for Review" final step — hitting "Save" alone does not submit.
  • Job requisition numbers are useful when following up with HR by email.
