Rhoda AI

Technology

Research Scientist / Engineer - Post-training & Robot Learning

$175–250k ~AI est. · Mountain View, California, United States · FULL TIME

Market Sentiment

HIGH DEMAND

Neural analysis suggests this role is optimal for Senior candidates.

The Brief

“Research Scientist / Engineer - Post-training & Robot Learning at Rhoda AI. Skills: Robot learning, Post-training, Policy learning, Reinforcement learning. Design RL training pipelines. Improve robot policy performance.”

Industry & Context.

Technology

Problems you'll solve

Root cause analysis; Troubleshooting

What They're Looking For.

Must Have

Hands-on experience with robot systems, Robotic policy learning experience, Autonomous systems experience, Understanding of robot policy learning, Practical familiarity with real robot hardware, Solid ML skills, Hands-on PyTorch experience, Ability to diagnose policy failures, Reason about distribution shift, Iterate effectively on data and training strategies, Comfort with ambiguity, Fast-changing research priorities

Nice to Have

Hands-on experience with reinforcement learning, Prior industry experience in robotics, Prior industry experience in autonomous driving, Prior industry experience in physical AI, Experience with teleoperation systems, Robot demonstration collection at scale experience, Familiarity with robot middleware, Familiarity with real-time control systems, Experience with simulation environments for robotics, Understanding of video generation models, PhD in Robotics, PhD in ML, Publication record at ICRA, Publication record at CoRL, Publication record at RSS, Publication record at NeurIPS

What You'll Do.

Design RL training pipelines

Improve robot policy performance

Develop RL algorithms

Design post-training pipelines

Implement post-training pipelines

Work on inverse dynamics model

Translate video predictions

Build evaluation frameworks

Research adaptation methods

Adapt models to new tasks

Identify failure modes

Drive targeted improvements

Iterate between simulation and real robot

Close the feedback loop

Surface missing capabilities

How You'll Work.

Team & Collaboration

Pre-training team

Full Job Description

At Rhoda AI, we’re building the next generation of generalist intelligent robots. We own the full robotics stack, from high-performance hardware and robot systems to the infrastructure and state-of-the-art foundation world models that control our robots. Our robots are designed to be generalists capable of operating in complex, real-world environments and handling long-tail edge cases, made possible by our cutting-edge research and end-to-end system design. We've raised over $400M and are investing aggressively in model research, infrastructure, hardware development, and manufacturing scale-up to make generalist robotics a reality.

We're looking for Research Scientists and Research Engineers with deep robotics or autonomous systems domain knowledge to adapt our web-pretrained video model to real robot tasks. Post-training at Rhoda means taking a causal video generation model pretrained on internet-scale data and fine-tuning it on robot-collected demonstrations to produce reliable, generalizable behavior with as little task-specific data as possible. We hire across levels, from senior to staff.

What You'll Do

- Design and implement RL training pipelines to improve robot policy performance beyond what imitation learning alone achieves: reward design, online data collection, and policy optimization
- Develop and apply RL algorithms (PPO, GRPO, or similar) adapted to the video prediction setting, including reward modeling and feedback collection strategies for physical task performance
- Design and implement broader post-training pipelines: supervised fine-tuning, preference optimization, and behavioral alignment on robot-collected demonstration data
- Work on the inverse dynamics model that translates video predictions into executable robot actions
- Build evaluation frameworks for post-trained policies: task success, generalization to novel objects and environments, and failure mode analysis on real hardware
- Research methods to efficiently adapt models to new tas

SKILL SIGNAL 24 detected · ranked by frequency

Reinforcement learning ×3

Policy optimization ×3

Reward design ×3

Online data collection ×3

Inverse dynamics model ×3

Task success evaluation ×3

Generalization analysis ×3

Failure mode analysis ×3

Few-shot adaptation ×3

Robot learning ×2

Post-training ×2

Policy learning ×2

PyTorch

Imitation learning

Behavior cloning

Video generation models

Action prediction

System design

Research strategy

Technical direction

ROS/ROS2

MuJoCo

Isaac Sim

Genesis

BEHAVIOURAL

Leadership

Role Details

Seniority: Senior

Experience: 5–10 yrs

Level: Senior

Type: FULL TIME

Education: PhD

Category: software

Salary Band: 150k-200k

AI-Extracted Insights

Domain Areas

robotics, autonomous-systems, physical-ai, robot-policy-learning, video-prediction, real-robot-hardware, real-time-control-systems, robot-middleware

How to Apply on Ashby

Ashby is a fast modern ATS — most applications take under 3 minutes.
The resume parser is strong; verify parsed experience dates and job titles.
Custom screening questions are often scored algorithmically — answer completely.
Location field affects geo-based screening; use your actual metro area.
