NVIDIA

AI Computing

AI and Systems Software Intern, At Scale AI - Fall 2026

Santa Clara, California, United States FULL TIME
Market Sentiment
HIGH DEMAND

Neural analysis suggests this role is optimal for entry-level candidates.

The Brief

“AI and Systems Software Intern, At Scale AI - Fall 2026 at NVIDIA. Skills: AI, Systems Software, Debugging, Infrastructure. Investigate and triage failures. Perform deep-dive analysis”

What You'll Achieve.

Reduce noise and identify root causes; Drive infrastructure improvements; Ensure jobs run as fast and reliably as possible; Make intelligent, data-backed engineering decisions

Industry & Context.

AI Computing

Problems you'll solve

Problem-solving skills; Analytical skills; Isolate issues in complex, distributed systems; Identify root causes

What They're Looking For.

Must Have

Python, Bash/Shell scripting, Debugging skills, High-performance computing (HPC) environments, Cluster managers, Large-scale distributed systems

Nice to Have

Server architecture, PCIe, NVLink, CPU/GPU interactions, Hardware diagnostics, Monitoring and logging tools, Prometheus, Grafana, ELK stack, System profiling tools, strace, gdb, perf, Industry benchmarks on Linux systems

What You'll Do.

Investigate and triage failures

Perform deep-dive analysis

Analyze logs and telemetry

Correlate job failures to issues

Track and report reliability metrics

Analyze workload issues

Search for improvement opportunities

Work closely with a mentor

How You'll Work.

Team & Collaboration

Work closely with a mentor; Interact with OS, container technologies, GPU compute, and systems specialists; Work with scientific researchers, developers, and customers

Communication Scope

Communication skills

Full Job Description

Our work at NVIDIA is dedicated towards a computing model focused on visual and AI computing. For two decades, NVIDIA has pioneered visual computing, the art and science of computer graphics, with our invention of the GPU. The GPU has also proven to be spectacularly effective at solving some of the most complex problems in computer science. Today, NVIDIA’s GPU simulates human intelligence, running deep learning algorithms and acting as the brain of computers, robots and self-driving cars that can perceive and understand the world. We are looking to grow our company and teams with the smartest people in the world and there has never been a more exciting time to join our team!

NVIDIA is looking for an intern for an exciting role in AI and Systems Software for datacenter applications. You will be deeply involved in system-level debugging, analyzing our large-scale infrastructure reliability, and correlating complex failure modes to underlying hardware or system issues.

We are working with the latest Accelerated Computing and Deep Learning software and hardware platforms, along with many scientific researchers, developers, and customers to craft improved workflows and develop new, leading differentiated solutions. Our team interacts with OS, container technologies, GPU compute, and systems specialists to architect, develop and bring up large scale performance software components and optimize performance.

**What you’ll be doing:**

* Investigate and triage failures within large-scale compute clusters, performing deep-dive analysis to distinguish between software glitches, configuration errors, and hardware faults.
* Analyze logs and telemetry to correlate specific job failures to system-level issues and diagnostic test failures, helping to reduce noise and identify root causes.
* Assist with the tracking, calculation, and reporting on key reliability metrics, specifically Mean Time Between Failures (MTBF) and Mean Time Between Interruptions (MTBI), to drive infrastructure
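For applicants unfamiliar with the two reliability metrics named above, the following is a minimal sketch of how they are commonly computed: total observed runtime divided by the number of events in that window. The posting does not define either metric precisely, so the definitions here are assumptions: a "failure" is taken to be a hardware or system fault, and an "interruption" is any event that stops a running job. The function name and all numbers are hypothetical and do not come from NVIDIA.

```python
from datetime import timedelta
from typing import Optional


def mean_time_between(total_runtime: timedelta, event_count: int) -> Optional[timedelta]:
    """Return total runtime divided by the number of events.

    Returns None when no events were observed, since the mean is
    undefined in that case (hypothetical helper, not an NVIDIA API).
    """
    if event_count == 0:
        return None
    return total_runtime / event_count


# Hypothetical observation window: 720 hours of cluster runtime.
runtime = timedelta(hours=720)
failures = 6        # assumed count of hardware/system failures
interruptions = 9   # assumed count of job interruptions from any cause

mtbf = mean_time_between(runtime, failures)
mtbi = mean_time_between(runtime, interruptions)
print("MTBF:", mtbf)
print("MTBI:", mtbi)
```

Under these assumptions, interruptions are at least as frequent as failures, so MTBI is never larger than MTBF for the same window.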


How to Apply on Workday

  • Workday has a multi-step form — save your progress after every section.
  • "Apply With LinkedIn" can fail or lose data; manual entry is more reliable.
  • Watch for the "Submit for Review" final step — hitting "Save" alone does not submit.
  • Job requisition numbers are useful when following up with HR by email.
