NVIDIA

AI Computing

AI Computing Development Engineer, TensorRT and TensorRT-LLM

Shanghai, China FULL TIME

Market Sentiment

HIGH DEMAND

Analysis suggests this role is best suited to mid-level candidates.

The Brief

“AI Computing Development Engineer, TensorRT and TensorRT-LLM at NVIDIA. Skills: TensorRT, TensorRT-LLM, AI Computing, Deep Learning. Design and develop inferencing software.”

Industry & Context.

AI Computing

Problems you'll solve

Solve complex problems

What They're Looking For.

Must Have

Master's or higher degree in Computer Engineering, Computer Science, Applied Mathematics, or related computing-focused field (or equivalent experience)

C/C++ or Python programming and software design experience, including debugging, performance profiling, and test design

2+ years working experience

Curiosity about artificial intelligence and familiarity with the latest developments in deep learning, including generative models, multimodal systems, and large neural networks

Experience working with deep learning frameworks

Proactive, self-driven, and able to work independently

Excellent written and verbal communication skills in English

Demonstrated ability, commensurate with experience, to take technical ownership, solve complex problems, and contribute effectively in cross-functional environments

Nice to Have

PyTorch, TensorRT/TensorRT-LLM, NeMo, vLLM

What You'll Do.

Design inferencing software

Develop inferencing software

Optimize software performance

Perform performance analysis

Optimize inference workloads

Tune inference workloads

Track AI advancements

Integrate AI advancements

Update TensorRT/TensorRT-LLM

Shape direction of ML inferencing

Deliver technical work

Publish technical results

How You'll Work.

Team & Collaboration

Collaborate across hardware, software, and research teams; contribute effectively in cross-functional environments

Communication Scope

Excellent written and verbal communication skills in English

Full Job Description

NVIDIA is hiring software engineers for its AI Computing team. Academic and commercial groups around the world are using GPUs to power a revolution in deep learning-powered AI, enabling breakthroughs in areas like generative AI, computer vision, speech recognition, recommender systems, and large-scale language and multimodal models. Join the team building the inferencing software (TensorRT/TensorRT-LLM) that will be used across our product lines. The ability to work in a fast-paced, delivery-focused environment is required, and excellent interpersonal skills are a must.

**What you'll be doing:**

* Design and develop robust inferencing software (TensorRT/TensorRT-LLM) optimized for functionality and performance across platforms
* Perform performance analysis, optimization, and tuning of deep learning inference workloads
* Track and integrate academic and industry advancements in AI and feature-update TensorRT/TensorRT-LLM accordingly
* Provide feedback into architecture and hardware design and development
* Collaborate across hardware, software, and research teams to shape the direction of machine learning inferencing across NVIDIA platforms
* Own and deliver technical work with scope based on experience, ranging from complex features to substantial parts of larger projects, with increasing independence and technical leadership over time
* Publish key technical results at leading scientific and engineering conferences

**What we need to see:**

* Masters or higher degree in Computer Engineering, Computer Science, Applied Mathematics, or related computing-focused field (or equivalent experience)
* Strong C/C++ or Python programming and software design experience, including debugging, performance profiling, and test design
* 2+ years working experience
* Strong curiosity about artificial intelligence and familiarity with the latest developments in deep learning, including generative models, multimodal systems, and large neural networks
* Experience working with deep le


SKILL SIGNAL 17 detected · ranked by frequency

TensorRT ×4

TensorRT-LLM ×4

C/C++ programming ×3

Python programming ×3

software design ×3

debugging ×3

performance profiling ×3

test design ×3

AI Computing ×2

Deep Learning ×2

C/C++

Python

PyTorch

NeMo

vLLM

technical ownership

cross-functional environments

Role Details

Seniority mid

Experience 2–5 yrs

Level Mid

Type FULL TIME

Education Masters or higher degree in Computer Engineering, Computer S

AI-Extracted Insights

Domain Areas

deep-learning, generative-ai, computer-vision, speech-recognition, recommender-systems, large-scale-language-models, multimodal-models, machine-learning-inferencing

How to Apply on Workday

Workday has a multi-step form — save your progress after every section.
"Apply With LinkedIn" can fail or lose data; manual entry is more reliable.
Watch for the "Submit for Review" final step — hitting "Save" alone does not submit.
Job requisition numbers are useful when following up with HR by email.
