Rhoda AI

Technology

Research Scientist / Engineer - Video Generation Modeling

$180–270k (AI estimate) · Palo Alto, California, United States · Full time
Market Sentiment
HIGH DEMAND

Neural analysis suggests this role is optimal for Senior candidates.

The Brief

“Research Scientist / Engineer - Video Generation Modeling at Rhoda AI. Skills: Video generation modeling, Large-scale pre-training, Robot control, Video prediction. Design large-scale causal video generation models. Train large-scale causal video generation models”

Industry & Context.

Technology
Problems you'll solve

Identify high-leverage questions; Cut through noise

What They're Looking For.

Must Have

Large-scale generative modeling background, Hands-on experience training large generative models, Deep understanding of autoregressive modeling, Deep understanding of causal architectures, Deep understanding of scaling behavior, Fluency with modern ML frameworks, Ability to design experiments, Ability to interpret results, Ability to iterate quickly, Research taste, Comfort operating in a fast-moving startup environment

Nice to Have

PhD in ML, CS, or Robotics, or equivalent research/industry experience; Publication record at top-tier ML and robotics venues; Prior work on video generation models; Experience with large-scale autoregressive language model pretraining; Experience with scaling autoregressive language models; Familiarity with web-scale video datasets; Familiarity with video data curation pipelines; Prior work connecting video generation to control; Prior work connecting video generation to action prediction; Prior work connecting video generation to robotic learning

What You'll Do.

Design large-scale causal video generation models

Train large-scale causal video generation models

Develop training objectives

Validate training objectives

Develop model architectures

Validate model architectures

Develop data mixtures

Validate data mixtures

Research scaling laws

Research data efficiency

Investigate web video properties

Build systematic evaluations

Measure video generation quality

Measure long-horizon prediction fidelity

Measure downstream robot task performance

Run rigorous ablations

Collaborate with data teams

Collaborate with evaluation teams

Collaborate with post-training teams

Collaborate with training systems teams

Translate research ideas into systems

How You'll Work.

Team & Collaboration

Data teams; Evaluation teams; Post-training teams; Training systems teams

Full Job Description

At Rhoda AI, we’re building the next generation of generalist intelligent robots. We own the full robotics stack, from high-performance hardware and robot systems to the infrastructure and state-of-the-art foundation world models that control our robots. Our robots are designed to be generalists capable of operating in complex, real-world environments and handling long-tail edge cases, made possible by our cutting-edge research and end-to-end system design. We've raised over $400M and are investing aggressively in model research, infrastructure, hardware development, and manufacturing scale-up to make generalist robotics a reality.

We're looking for Research Scientists and Research Engineers to push the frontier of large-scale pre-training for our video action model. Our approach formulates robot control as video prediction: we pre-train causal video generation models on web-scale video data, then adapt them to predict robot actions from real-world demonstrations. You'll work on the core architectures, training objectives, and scaling strategies that determine how well our models learn from internet-scale video. We hire across levels, from senior to staff, and welcome both research-track and engineering-track candidates.

What You'll Do

  • Design and train large-scale causal video generation models on web-scale video data
  • Develop and validate training objectives, model architectures, and data mixtures for video prediction at scale
  • Research scaling laws and data efficiency for web-scale video pretraining
  • Investigate what properties of web video transfer most effectively to robotic control and action prediction
  • Build systematic evaluations to measure video generation quality, long-horizon prediction fidelity, and downstream robot task performance
  • Run rigorous ablations and benchmarking to understand what drives model quality at scale
  • Collaborate closely with data & evaluation, post-training, and training systems teams to translate research ideas into worki

How to Apply on Ashby

  • Ashby is a fast modern ATS — most applications take under 3 minutes.
  • The resume parser is strong; verify parsed experience dates and job titles.
  • Custom screening questions are often scored algorithmically — answer completely.
  • Location field affects geo-based screening; use your actual metro area.
