NVIDIA

Senior Applied Deep Learning Scientist - Large Vision Language Models

Zurich, Switzerland FULL TIME

The Brief

“Senior Applied Deep Learning Scientist - Large Vision Language Models at NVIDIA. Skills: Deep Learning, Large Vision Language Models, Multimodal Language Models, LLMs, VLMs, PyTorch, Python. Push the boundaries of the NVIDIA Nemotron Omni family of models to enable powerful downstream applications, including document intelligence, mathematical reasoning, multi-turn multimodal dialogue systems, and advanced software & agentic assistants. Span the full pipeline, from pre-training through post-tra”

What You'll Achieve.

  • Deliver tangible impact.
  • Solve real-world problems.
  • Turn research into reality.
  • Create models that perform exceptionally in real-world applications right out of the box.
  • Empower and advance the broader multimodal LLM ecosystem.

What They're Looking For.

Must Have

  • M.Sc. or Ph.D. in Computer Science (or a related field), or equivalent research experience in LLMs, systems, or connected areas.
  • 10+ years of industry experience in computer vision, including designing data pipelines for diverse data modalities and deploying models from research into production.
  • Understanding of the theoretical foundations of LLMs/VLMs and familiarity with the latest academic developments in the field.
  • Solid hands-on coding skills with PyTorch and Python, experience with multi-GPU training on large-scale compute clusters, fluency with Docker, and Linux systems expertise.

Nice to Have

  • Contributions to open-source LLM systems or large-scale AI infrastructure.
  • Previous AI-related projects or entrepreneurial experience in a closely connected domain.
  • An academic track record of publications in deep learning.

What You'll Do.

  • Push the boundaries of the NVIDIA Nemotron Omni family of models to enable powerful downstream applications, including document intelligence, mathematical reasoning, multi-turn multimodal dialogue systems, and advanced software & agentic assistants.
  • Span the full pipeline, from pre-training through post-training.
  • Prepare large-scale multimodal datasets to train cutting-edge foundation models across text, image, audio and video.
  • Develop robust data processing pipelines to curate high-quality training data, augmenting it, synthetically generating labels and providing the infrastructure to load and serve data in real time.

How You'll Work.

Team & Collaboration

Collaborate globally with other team members, researchers and developers from different departments at NVIDIA and AI startups we work with, to turn research and innovations into impactful products.

Full Job Description

We are looking for a highly motivated Senior Applied Deep Learning Scientist with a passion for multimodal language models. Join our world-class NVIDIA team, spanning Finland, Germany, the Netherlands, and the USA, behind pioneering work such as [_Megatron-Energon_](https://github.com/NVIDIA/Megatron-Energon), [_Nemotron 3 Nano Omni_](https://developer.nvidia.com/blog/nvidia-nemotron-3-nano-omni-powers-multimodal-agent-reasoning-in-a-single-efficient-open-model/) and our latest [_post-training datasets_](https://huggingface.co/datasets/nvidia/Nemotron-Image-Training-v3)! As a core contributor to NVIDIA’s Nemotron multimodal initiative, we are pushing the frontiers of state-of-the-art open-source multimodal models. We have a unique perspective in that we strive for open models, open weights, open data. Our mission is straightforward: create models that perform exceptionally in real-world applications right out of the box, while empowering and advancing the broader multimodal LLM ecosystem.

As an applied research group, we prioritize delivering tangible impact and solving real-world problems. We’re most excited when deep learning moves beyond theory into production at scale. If you share a pragmatic, delivery-focused perspective and care about turning research into reality, this team will feel like the right home!

**What you will be doing:**

* Push the boundaries of the NVIDIA Nemotron Omni family of models to enable powerful downstream applications, including document intelligence, mathematical reasoning, multi-turn multimodal dialogue systems, and advanced software & agentic assistants. The role spans the full pipeline, from pre-training through post-training.
* Help us prepare large-scale multimodal datasets to train cutting-edge foundation models across text, image, audio and video. This includes developing robust data processing pipelines to curate high-quality training data, augmenting it, synthetically generating labels and providing the infrastructure to load

How to Apply on Workday

  • Workday has a multi-step form — save your progress after every section.
  • "Apply With LinkedIn" can fail or lose data; manual entry is more reliable.
  • Watch for the "Submit for Review" final step — hitting "Save" alone does not submit.
  • Job requisition numbers are useful when following up with HR by email.
