NVIDIA
Datacenters
Senior Deep Learning Systems Engineer, Datacenters
Neural analysis suggests this role is
optimal for Senior candidates.
“Senior Deep Learning Systems Engineer, Datacenters at NVIDIA. Skills: Deep Learning, System Software, Silicon Architecture, Performance Modeling, Performance Analysis, C/C++, Python. Analyze the performance and power consumption of deep learning applications on datacenter-class hardware. Influence the design and optimization of datacenters.”
What You'll Achieve.
getting the most out of our exponentially growing datacenter deployments; establishing a data-driven approach to hardware design and system software development; significantly influencing the design and optimization of datacenters; optimizing our next-generation systems and Deep Learning Software Stack
Industry & Context.
performance analysis; efficiency improvement
What They're Looking For.
Must Have
Bachelor's degree in Electrical Engineering or Computer Science or equivalent experience; 8 years or more of relevant experience; Experience in at least one of the following: System Software: Operating Systems (Linux), Compilers, GPU kernels (CUDA), DL Frameworks (PyTorch, TensorFlow); Experience in at least one of the following: Silicon Architecture and Performance Modeling/Analysis: CPU, GPU, Memory or Network Architecture; Experience programming in C/C++ and Python; A deep understanding of computer system architecture and performance analysis; Demonstrated hands-on experience in computer system architecture and performance analysis; Demonstrated ability to work in virtual environments; A drive to own tasks from beginning to end
Nice to Have
Master's or PhD degree preferred; Exposure to Containerization Platforms (Docker); Exposure to Datacenter Workload Managers (Slurm); Prior experience with virtual environments; Background with system software, operating system intrinsics, GPU kernels (CUDA), or DL Frameworks (PyTorch, TensorFlow); Experience with silicon performance monitoring or profiling tools (e.g. perf, gprof, nvidia-smi, dcgm); In-depth performance modeling experience in any one of CPU, GPU, Memory or Network Architecture; Prior experience with multi-site teams or multi-functional teams
What You'll Do.
analyze the performance and power consumption of deep learning applications on datacenter-class hardware
influence the design and optimization of datacenters
find how CPU, GPU, networking, and IO relate to deep learning (DL) architectures for Natural Language Processing, Computer Vision, Autonomous Driving and other technologies
optimize next generation systems and Deep Learning Software Stack
develop software infrastructure to characterize and analyze a broad range of Deep Learning applications
evolve cost-efficient datacenter architectures tailored to meet the needs of Large Language Models (LLMs)
develop analysis and profiling tools in Python, bash and C++ to measure key performance metrics of DL workloads running on NVIDIA systems
analyze system and software characteristics of DL applications
develop analysis tools and methodologies to measure key performance metrics and to estimate potential for efficiency improvement
How You'll Work.
Team & Collaboration
Work with experts; Prior experience with multi-site teams or multi-functional teams
Process & Methodology
drive to own tasks from beginning to end
Full Job Description
As NVIDIA makes inroads into the Datacenter business, our team plays a central role in getting the most out of our exponentially growing datacenter deployments as well as establishing a data-driven approach to hardware design and system software development. The role of a Deep Learning Systems Engineer would be to analyze the performance and power consumption of deep learning applications on datacenter-class hardware and significantly influence the design and optimization of datacenters.

Do you want to influence the development of high-performance Datacenters designed for the future of AI? Do you have an interest in system architecture and performance? In this role you will find how CPU, GPU, networking, and IO relate to deep learning (DL) architectures for Natural Language Processing, Computer Vision, Autonomous Driving and other technologies. Come join our team, and bring your interests to help us optimize our next generation systems and Deep Learning Software Stack.

**What you'll be doing:**

* Help develop software infrastructure to characterize and analyze a broad range of Deep Learning applications.
* Evolve cost-efficient datacenter architectures tailored to meet the needs of Large Language Models (LLMs).
* Work with experts to help develop analysis and profiling tools in Python, bash and C++ to measure key performance metrics of DL workloads running on NVIDIA systems.
* Analyze system and software characteristics of DL applications.
* Develop analysis tools and methodologies to measure key performance metrics and to estimate potential for efficiency improvement.

**What we need to see:**

* A Bachelor’s degree in Electrical Engineering or Computer Science or equivalent experience (Masters or PhD degree preferred).
* 8 years or more of relevant experience.
* Experience in at least one of the following:
  * System Software: Operating Systems (Linux), Compilers, GPU kernels (CUDA), DL Frameworks (PyTorch, TensorFlow).
  * Silicon Architecture and Performance Modeling/Ana
How to Apply on Workday
- Workday has a multi-step form — save your progress after every section.
- "Apply With LinkedIn" can fail or lose data; manual entry is more reliable.
- Watch for the "Submit for Review" final step — hitting "Save" alone does not submit.
- Job requisition numbers are useful when following up with HR by email.