NVIDIA
Senior Datacenter Technical Program Manager, At-Scale AI Clusters
Neural analysis suggests this role is optimal for Senior candidates.
“Senior Datacenter Technical Program Manager, At-Scale AI Clusters at NVIDIA. Skills: Datacenter Technical Program Management, At-Scale AI Clusters, Systems Integration, High-Performance Computing. Drive datacenter integration for the next generation of NVIDIA AI supercomputing systems. Drive collaboration between engineering leaders across multiple hardware and software teams.”
Industry & Context.
driving the process of finding a solution
What They're Looking For.
Must Have
BS in Applied Science or Engineering (or equivalent experience)
8+ years of overall experience
Experience with high-performance computing systems and GPU clusters deployed in on-premises datacenters
A passion for understanding challenging technical problems and driving the process of finding a solution
Strong teamwork and interpersonal skills, to facilitate building a collaborative workflow for coordination between many teams
Nice to Have
Understanding of datacenter design, including familiarity with power and cooling technologies
Expertise in system monitoring and instrumentation of large clusters, using technologies such as Prometheus, Grafana, Splunk, Modbus, and BACnet
Experience working with the engineering or academic research community supporting high-performance computing or deep learning
What You'll Do.
Drive datacenter integration for the next generation of NVIDIA AI supercomputing systems
Drive collaboration between engineering leaders across multiple hardware and software teams
Build AI supercomputers for NVIDIA engineers
Develop reference architectures to advise customers and partners
Build and deploy large-scale GPU computing systems based on NVIDIA's reference supercomputing architectures
Lead the integration of new AI clusters with datacenter facilities with demanding requirements on power, cooling, and instrumentation
Coordinate design and fit-out of new datacenter builds, working with both internal engineering teams and external contractors
Own and produce detailed documentation for the end-to-end process for datacenter fit-out and integration
Communicate internally with engineering leadership to prioritize and address key issues essential to the success of our largest customers
How You'll Work.
Team & Collaboration
Drive collaboration between engineering leaders across multiple hardware and software teams; facilitate building a collaborative workflow for coordination between many teams; work with both internal engineering teams and external contractors
Communication Scope
Communicate internally with engineering leadership
Process & Methodology
Technical Program Management, lifecycle management, requirements definition, systems integration, coordination, prioritization
Full Job Description
NVIDIA is looking for a highly-motivated Technical Program Manager (TPM) to join our Applied Systems Engineering Team to drive datacenter integration for the next generation of NVIDIA AI supercomputing systems. This TPM will play a crucial role throughout the lifecycle of the latest AI systems at scale, from datacenter design and requirements definition, through systems integration of AI clusters into the datacenter environment, and support for these systems as they enter production. This role will drive collaboration between engineering leaders across multiple hardware and software teams, helping us work together to build AI supercomputers for NVIDIA engineers and develop reference architectures to advise customers and partners.

**What you’ll be doing:**

* Collaborate with outstanding engineers and architects to build and deploy large scale GPU computing systems based on NVIDIA's reference supercomputing architectures
* Lead the integration of new AI clusters with datacenter facilities with demanding requirements on power, cooling, and instrumentation
* Coordinate design and fit-out of new datacenter builds, working with both internal engineering teams and external contractors
* Own and produce detailed documentation for the end-to-end process for datacenter fit-out and integration
* Communicate internally with engineering leadership to prioritize and address key issues essential to the success of our largest customers

**What we need to see:**

* BS in Applied Science or Engineering (or equivalent experience)
* 8+ years of overall experience
* Experience with high-performance computing systems and GPU clusters deployed in on-premises datacenters
* A passion for understanding challenging technical problems and driving the process of finding a solution
* Strong teamwork and interpersonal skills, to facilitate building a collaborative workflow for coordination between many teams

**Ways to stand out from the crowd:**

* Understanding of datacenter design, including famil
How to Apply on Workday
- Workday has a multi-step form — save your progress after every section.
- "Apply With LinkedIn" can fail or lose data; manual entry is more reliable.
- Watch for the "Submit for Review" final step — hitting "Save" alone does not submit.
- Job requisition numbers are useful when following up with HR by email.