NVIDIA
AI
Software Solutions Engineer
Neural analysis suggests this role is best suited to mid-level candidates.
“Software Solutions Engineer at NVIDIA. Skills: AI, Cloud, Datacenter, Kubernetes, Python, Troubleshooting, Customer Support. Support, triage, and resolve complex customer software issues end-to-end. Build software features, automation, diagnostics, reproducible test cases, and deployment tooling.”
What You'll Achieve.
improve product readiness; scale support across enterprise environments; improve NVIDIA AI Enterprise deployment reliability; reduce recurring issues; improve time-to-value
Industry & Context.
Support, triage and resolve complex customer software issues end-to-end; debugging skills; troubleshooting fundamentals; structured approach to isolating issues; passion to solve problems
Be on call one weekend per month in the event a customer has a Sev1 outage and requires engineering assistance
What They're Looking For.
Must Have
- BS in Computer Science, Electrical Engineering, Computer Engineering, or related field (or equivalent experience)
- 5+ years of system software development and troubleshooting experience
- Computer science fundamentals
- Programming/scripting skills (Python, Bash; Go/C++ a plus)
- Troubleshooting fundamentals (networking, concurrency, OS concepts)
- Structured approach to isolating issues across application, platform, and infrastructure layers
- Deep understanding of at least two of the following: data centers/servers, distributed systems, virtualization, deep learning frameworks, containers (Docker/Kubernetes), hybrid cloud (AWS/Azure/GCP), and CI/CD for reliable deployments
- Deep Linux knowledge
- Professional-level communication skills, interpersonal skills, and a passion for solving problems
Nice to Have
- Some customer-facing experience
- Working knowledge of Windows
- Hands-on experience deploying and operating NVIDIA AI Enterprise components in production across on-prem or CSP environments
- Hands-on experience using AI coding assistants/tools (e.g., Cursor, Claude Code, Codex, or similar) to accelerate debugging, automation, and test creation
- Experience operating Kubernetes-based platforms in production (cluster operations, upgrades, control-plane/data-plane failure modes)
- Performance debugging skills for GPU and cloud workloads (profiling, latency/throughput tuning)
- Familiarity with observability/tracing tools
What You'll Do.
- Triage and resolve complex customer software issues end-to-end
- Build software features, automation, diagnostics, reproducible test cases, and deployment tooling
- Develop and maintain product-facing features and deployment assets for AI Enterprise supportability (e.g., scripts, configuration guidance, Kubernetes manifests/Helm charts, and reproducible test cases)
- Develop and maintain Python-based tooling/automation (validators, log collectors, repro harnesses) to improve NVIDIA AI Enterprise deployment reliability across NGC and container orchestrators (e.g., Kubernetes)
- Contribute code-level fixes, patches, or pull requests (as appropriate) in collaboration with engineering to address customer-impacting issues and improve product readiness
- Support enterprise customers deploying NVIDIA AI Enterprise in datacenter and CSP environments, including Kubernetes-based and containerized production AI platforms
- Take ownership of customer issues from inception to resolution
- Create high-quality bug reports and RFEs with clear repro steps, environment details (CSP/Kubernetes/GPU), and supporting artifacts
- Develop customer-facing and internal documentation (KBs
How You'll Work.
Team & Collaboration
Work closely with customers and internal engineering teams to understand issues, explain root causes, drive resolution, and collaborate on fixes and improvements; partner with engineering on fixes.
Communication Scope
customer-facing role; crisp communication; Professional-level communication skills
Full Job Description
We are looking for a Software Solutions Engineer to support NVIDIA AI Enterprise customers and deployments across cloud and datacenter environments. This is a dual role: (1) support, triage, and resolve complex customer software issues end-to-end, and (2) build software features, automation, diagnostics, reproducible test cases, and deployment tooling to improve product readiness and scale support across enterprise environments.

You will work across compute and cloud-native technologies in CSP environments, including container platforms/orchestrators, enterprise system software, and GPU-accelerated AI frameworks and inference services used to run production AI workloads at scale. In this customer-facing role, you will work closely with customers and internal engineering teams to understand issues, explain root causes, drive resolution, and collaborate on fixes and improvements. Success in this role requires strong debugging skills, crisp communication, and ownership of technically deep escalations from inception to closure.

What you'll be doing:
- Develop and maintain product-facing features and deployment assets for AI Enterprise supportability (e.g., scripts, configuration guidance, Kubernetes manifests/Helm charts, and reproducible test cases)
- Develop and maintain Python-based tooling/automation (validators, log collectors, repro harnesses) to improve NVIDIA AI Enterprise deployment reliability across NGC and container orchestrators (e.g., Kubernetes)
- Contribute code-level fixes, patches, or pull requests (as appropriate) in collaboration with engineering to address customer-impacting issues and improve product readiness
- Support enterprise customers deploying NVIDIA AI Enterprise in datacenter and CSP environments, including Kubernetes-based and containerized production AI platforms
- Take ownership of customer issues from inception to resolution: reproduce in lab/cloud, collect diagnostics, provide mitigations, and partner with engineering on fixes
- C
How to Apply on Workday
- Workday has a multi-step form — save your progress after every section.
- "Apply With LinkedIn" can fail or lose data; manual entry is more reliable.
- Watch for the "Submit for Review" final step — hitting "Save" alone does not submit.
- Job requisition numbers are useful when following up with HR by email.