NVIDIA
Data Center Systems
Senior Engineering Manager - Compute Server Bring Up
“Senior Engineering Manager - Compute Server Bring Up at NVIDIA. Skills: Compute Server Bring Up, Systems/Platform Software Team Management, Server Architecture, Firmware Development, Data Center Solutions. Own Initial Power-On and Board Bring-Up. Form and lead a virtual team across NVIDIA software & firmware teams”
What You'll Achieve.
Ensure servers are fully functional and validated as per requirements before mass deployment in data centers; Ensure all functional requirements are met; Ensure appropriate validation is done for boundary, stress, and regression testing; Confirm telemetry, logging, and hardware management features work as per requirements; Ensure usability, firmware/BIOS update coverage, and error reporting for reliable customer installation and operation; Ensure robust bring-up, productization, and delivery; Deliver server products as per defined Key Performance Indicators (KPIs)
Industry & Context.
Root cause analysis and resolution of bring-up failures; Find creative solutions to complicated problems
What They're Looking For.
Must Have
5+ years of relevant experience managing systems/platform software teams, ideally in server bring-up, firmware development, or data center solutions; Deep experience operating successfully in a matrix environment, forming and leading high-impact virtual teams spanning multiple disciplines; 12+ years of overall experience; Knowledge of compute tray designs, firmware enablement, and system-level architecture; Proven track record of delivering scalable server products and solutions for large-scale data centers; Experience collaborating with hardware, firmware, manufacturing, diags, and QA teams; Experience with SCM (Git, Perforce) and project management tools (Jira); Hands-on experience with x86/ARM system architecture; Self-starter who loves to find creative solutions to complicated problems; Proven excellence in server architecture and in collaborating across teams to deliver server products as per defined Key Performance Indicators (KPIs)
Nice to Have
Experience leading bring-up for sophisticated compute architectures like GB200 NVL72
What You'll Do.
Own Initial Power-On and Board Bring-Up
Form and lead a virtual team across NVIDIA software & firmware teams
Oversee flashing, updating, and validation of firmware for all server components
Support manufacturing flows and diagnostic procedures
Lead root cause analysis and resolution of bring-up failures
Own and maintain platform design guides and install instructions
Drive product life cycles with QA teams
Conduct performance evaluations
Develop a culture of excellence
Ensure high productivity
How You'll Work.
Team & Collaboration
Form and lead a larger virtual team spanning across NVIDIA software & firmware teams; Ensure subject matter experts are available as needed throughout bringup; Regular reporting on status of bringup to provide visibility and ensure teams across the company are fully activated to help; Collaborate with partners, ODMs, and customers for technical support; Collaborating across teams for delivering server products
Communication Scope
Excellent written and oral communication skills; Regular reporting on status of bringup
Process & Methodology
Project management tools (Jira); Drive product life cycles
Full Job Description
NVIDIA data center systems have become core to NVIDIA's rapidly growing enterprise and cloud provider businesses. These platforms bring together the full power of NVIDIA GPUs, NVIDIA NVLink, NVIDIA Networking, NVIDIA Data Center CPUs, and a fully optimized NVIDIA AI and HPC software stack.

We are seeking an excellent Senior Engineering Manager to lead the Compute Server Bring-Up team. This team is responsible for the bringup, integration, validation, and troubleshooting of compute tray platforms for GPU Racks, ensuring servers are fully functional and validated as per requirement before mass deployment in data centers. You will directly lead all aspects of a group of bringup engineers and form a larger virtual team spanning NVIDIA software & firmware teams to ensure successful bring-up of compute platforms both internally and with customers.

What you'll be doing:
- Own Initial Power-On and Board Bring-Up: Lead the initial power-on and functional validation of compute trays (CPU, GPU, NIC, storage including NVMe, cooling, etc.) internally and with customers. Ensure all functional requirements are met.
- Form and lead a virtual team across NVIDIA software & firmware teams to ensure subject matter experts are available as needed throughout bringup. Report regularly on bringup status to provide visibility and ensure teams across the company are fully activated to help.
- Oversee flashing, updating, and validation of firmware for all server components as per defined architecture. Ensure appropriate validation is done for boundary, stress, and regression testing, and confirm telemetry, logging, and hardware management features work as per requirements. Document pain points, bring-up failures, and recovery flows, and provide actionable feedback to hardware, firmware, and software teams. Ensure usability, firmware/BIOS update coverage, and error reporting for reliable customer installation and operation.
- Factory & Manufacturing Support: Support manufacturing flows,
How to Apply on Workday
- Workday has a multi-step form — save your progress after every section.
- "Apply With LinkedIn" can fail or lose data; manual entry is more reliable.
- Watch for the "Submit for Review" final step — hitting "Save" alone does not submit.
- Job requisition numbers are useful when following up with HR by email.