Rhoda AI
Technology
Research Scientist / Engineer - Post-training & Robot Learning
Neural analysis suggests this role is optimal for Senior candidates.
“Research Scientist / Engineer - Post-training & Robot Learning at Rhoda AI. Skills: Robot learning, Post-training, Policy learning, Reinforcement learning. Design RL training pipelines. Improve robot policy performance”
Industry & Context.
Root cause analysis; Troubleshooting
What They're Looking For.
Must Have
Hands-on experience with robot systems, Robotic policy learning experience, Autonomous systems experience, Understanding of robot policy learning, Practical familiarity with real robot hardware, Solid ML skills, Hands-on PyTorch experience, Ability to diagnose policy failures, Reason about distribution shift, Iterate effectively on data and training strategies, Comfort with ambiguity, Fast-changing research priorities
Nice to Have
Hands-on experience with reinforcement learning, Prior industry experience in robotics, Prior industry experience in autonomous driving, Prior industry experience in physical AI, Experience with teleoperation systems, Robot demonstration collection at scale experience, Familiarity with robot middleware, Familiarity with real-time control systems, Experience with simulation environments for robotics, Understanding of video generation models, PhD in Robotics, PhD in ML, Publication record at ICRA, Publication record at CoRL, Publication record at RSS, Publication record at NeurIPS
What You'll Do.
Design RL training pipelines
Improve robot policy performance
Develop RL algorithms
Design post-training pipelines
Implement post-training pipelines
Work on inverse dynamics model
Translate video predictions
Build evaluation frameworks
Research adaptation methods
Adapt models to new tasks
Identify failure modes
Drive targeted improvements
Iterate between simulation and real robot
Close the feedback loop
Surface missing capabilities
How You'll Work.
Team & Collaboration
Pre-training team
Full Job Description
At Rhoda AI, we’re building the next generation of generalist intelligent robots. We own the full robotics stack, from high-performance hardware and robot systems to the infrastructure and state-of-the-art foundation world models that control our robots. Our robots are designed to be generalists capable of operating in complex, real-world environments and handling long-tail edge cases, made possible by our cutting-edge research and end-to-end system design. We've raised over $400M and are investing aggressively in model research, infrastructure, hardware development, and manufacturing scale-up to make generalist robotics a reality.
We're looking for Research Scientists and Research Engineers with deep robotics or autonomous systems domain knowledge to adapt our web-pretrained video model to real robot tasks. Post-training at Rhoda means taking a causal video generation model pretrained on internet-scale data and fine-tuning it on robot-collected demonstrations to produce reliable, generalizable behavior with as little task-specific data as possible. We hire across levels, from senior to staff.
What You'll Do
- Design and implement RL training pipelines to improve robot policy performance beyond what imitation learning alone achieves: reward design, online data collection, and policy optimization
- Develop and apply RL algorithms (PPO, GRPO, or similar) adapted to the video prediction setting, including reward modeling and feedback collection strategies for physical task performance
- Design and implement broader post-training pipelines: supervised fine-tuning, preference optimization, and behavioral alignment on robot-collected demonstration data
- Work on the inverse dynamics model that translates video predictions into executable robot actions
- Build evaluation frameworks for post-trained policies: task success, generalization to novel objects and environments, and failure mode analysis on real hardware
- Research methods to efficiently adapt models to new tas
How to Apply on Ashby
- Ashby is a fast modern ATS — most applications take under 3 minutes.
- The resume parser is strong; verify parsed experience dates and job titles.
- Custom screening questions are often scored algorithmically — answer completely.
- Location field affects geo-based screening; use your actual metro area.