NVIDIA

Synthetic Data Generation and User Simulation PhD Research Intern, Fall 2026

Angers, Pays de la Loire, France · Full Time · Remote Friendly
Market Sentiment
HIGH DEMAND

Neural analysis suggests this role is optimal for entry-level candidates.

The Brief

“Synthetic Data Generation and User Simulation PhD Research Intern, Fall 2026, at NVIDIA. Skills: generative models, synthetic data generation, user simulation, LLM training. Responsibilities: research innovative techniques; craft and apply new methods.”

What You'll Achieve.

Create synthetic data that measurably improves downstream model performance

Industry & Context.

Problems you'll solve

Generating instructional and assessment data with generative models; grounding data in real-world distributions; population-grounded user simulation; verifier-grounded trajectory synthesis; synthetic data generation (SDG) quality measurement

What They're Looking For.

Must Have

  • PhD in Computer Science
  • Specialization: Machine Learning, Computational Linguistics, Computational Neuroscience, deep learning, NLP, LLM training
  • Research experience: generative modeling, synthetic data generation, LLM post-training, reward modeling, multi-agent simulation, interactive simulation, behavioral modeling, cognitive modeling, large-scale data curation
  • Python programming skills
  • Experience with deep learning frameworks
  • Experience with modern LLM training and serving stacks
  • Research background with publications

Nice to Have

  • Experience training and fine-tuning LLMs end-to-end
  • Experience evaluating LLMs against real downstream tasks
  • Experience with LLM-as-judge calibration, inter-rater agreement, and evaluator robustness
  • Experience with user simulation and agent–user interaction modeling
  • Behavioral modeling experience grounded in real population data
  • Behavioral modeling experience grounded in cognitive science
  • Interest in multilingual, low-resource, and sovereign-AI evaluation
  • Interest in multilingual, low-resource, and sovereign-AI training
  • Contributions to open-source projects

What You'll Do.

Research innovative techniques

Craft and apply new methods

Collaborate with researchers and engineers

Prepare research findings

How You'll Work.

Team & Collaboration

Collaborating with other researchers and engineers

Communication Scope

internal presentations

Full Job Description

Today, NVIDIA is tapping into the unlimited potential of AI to define the next era of computing. An era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what’s never been done before takes vision, innovation, and the world’s best talent. As an NVIDIAN, you’ll be immersed in a diverse, encouraging environment where everyone is inspired to do their best work. Come join the team and see how we can make a lasting impact on the world.

We're a research team dedicated to a major challenge in modern model development. It involves advanced artificial data creation across pre-training, post-training, and evaluation infrastructure. Collecting only real data at scale carries meaningful quality, cost, latency, and privacy tradeoffs; it tends to overrepresent certain populations; and it often leaves gaps on the long tail of languages, domains, demographics, and safety scenarios.

We're investigating how generative models can create instructional and assessment data that shows high utility. The measurement is based on downstream model performance instead of surface plausibility. Additionally, we explore grounding that data in real-world distributions to ensure it generalizes.

A major workstream within this agenda is population-grounded user simulation: synthetic users interacting with LLMs, calibrated against real behavioral signatures, and structured to yield training signals (SFT examples, preference pairs, verifier corpora, process reward models, on-policy RL environments). Other examples include verifier-grounded trajectory synthesis where ground truth exists, multilingual and low-resource coverage, and SDG quality measurement across pre- and post-training corpora. This is an opportunity to contribute to foundational research that will help shape how the next generation of AI models is trained.

What you'll be doing:

  • Researching innovative techniques in generative models, artificial data creation, user si

How to Apply on Workday

  • Workday has a multi-step form — save your progress after every section.
  • "Apply With LinkedIn" can fail or lose data; manual entry is more reliable.
  • Watch for the "Submit for Review" final step — hitting "Save" alone does not submit.
  • Job requisition numbers are useful when following up with HR by email.
