Rhoda AI

Technology

Research Engineer - Data Infrastructure

$185–265k (AI estimate) · Palo Alto, California, United States · Full time

The Brief

“Research Engineer - Data Infrastructure at Rhoda AI. Skills: Data Infrastructure, Distributed Systems, ML Infrastructure, Data Pipelines. Architect, build, and scale data infrastructure.”

Industry & Context.

Technology

Problems you'll solve

Performance optimization; Cost-performance tradeoffs

What They're Looking For.

Must Have

  • 5+ years of experience in data infrastructure, distributed systems, and ML infrastructure
  • Experience building and operating large-scale data pipelines
  • Deep understanding of distributed systems, databases, indexing strategies, and cloud storage architectures
  • Experience optimizing data throughput, workload balancing, and cost-performance tradeoffs
  • Experience with distributed compute frameworks
  • Skills in observability, monitoring, and production reliability
  • Software engineering fundamentals
  • Ability to own systems end-to-end

Nice to Have

  • Experience with data pipelines of 1B+ samples and with petabyte-scale systems
  • Experience managing large multimodal datasets
  • Familiarity with ML training workflows, data lifecycle management, and vision-language models
  • Experience running ML inference workloads
  • Experience with robotics data formats and real-world sensor data
  • Experience with data warehouse technologies
  • Familiarity with data versioning and data lineage tooling

What You'll Do.

Architect data infrastructure

Build data infrastructure

Scale data infrastructure

Design storage systems

Optimize storage systems

Build indexing systems

Build retrieval systems

Support dataset querying

Support dataset filtering

Support dataset iteration

Develop observability frameworks

Implement workload balancing

Implement throughput optimization

Manage data artifacts

Manage data versioning

Build internal interfaces

Build lightweight tools

Enable data exploration

Support integration of VLMs

Support scalable deployment of VLMs

How You'll Work.

Team & Collaboration

Collaboration with researchers and engineers

Full Job Description

At Rhoda AI, we’re building the next generation of generalist intelligent robots. We own the full robotics stack, from high-performance hardware and robot systems to the infrastructure and state-of-the-art foundation world models that control our robots. Our robots are designed to be generalists capable of operating in complex, real-world environments and handling long-tail edge cases, made possible by our cutting-edge research and end-to-end system design. We've raised over $400M and are investing aggressively in model research, infrastructure, hardware development, and manufacturing scale-up to make generalist robotics a reality.

We're looking for Data Infrastructure MLEs to scale the systems that power our model training data pipeline, from raw ingestion and storage to indexing, retrieval, and throughput optimization at massive scale. We hire across levels, from senior to staff.

What You'll Do

- Architect, build, and scale a high-throughput data infrastructure that processes and manages billions of video clips with strong guarantees around reliability, latency, and cost efficiency
- Design and optimize large-scale storage systems (cloud object storage, databases, metadata stores) for multimodal datasets
- Build efficient indexing and retrieval systems to support fast dataset querying, filtering, and iteration for research and production use cases
- Develop observability frameworks for data pipelines including monitoring, alerting, failure recovery, and performance optimization
- Implement intelligent workload balancing and throughput optimization across distributed compute and storage systems
- Manage data artifacts, versioning, and lineage to ensure reproducibility and traceability across training runs
- Build internal interfaces and lightweight tools that enable researchers and engineers to explore, query, and analyze large datasets at scale
- Support integration and scalable deployment of vision-language models (VLMs) within data pipelines for screening, enric
