Rhoda AI
Technology
Research Engineer - Data Infrastructure
Neural analysis suggests this role is optimal for Senior candidates.
Industry & Context.
Performance optimization; Cost-performance tradeoffs
What They're Looking For.
Must Have
5+ years of experience in data infrastructure
5+ years of experience in distributed systems
5+ years of experience in ML infrastructure
Experience building large-scale data pipelines
Experience operating large-scale data pipelines
Deep understanding of distributed systems
Deep understanding of databases
Deep understanding of indexing strategies
Deep understanding of cloud storage architectures
Experience optimizing data throughput
Experience with workload balancing
Experience with cost-performance tradeoffs
Experience with distributed compute frameworks
Skills in observability
Skills in monitoring
Skills in production reliability
Software engineering fundamentals
Ability to own systems end-to-end
Nice to Have
Experience with data pipelines of 1B+ samples
Petabyte-scale systems experience
Experience managing large multimodal datasets
Familiarity with ML training workflows
Familiarity with data lifecycle management
Familiarity with vision-language models
Experience running ML inference workloads
Experience with robotics data formats
Experience with real-world sensor data
Experience with data warehouse technologies
Familiarity with data versioning tooling
Familiarity with data lineage tooling
What You'll Do.
Architect data infrastructure
Build data infrastructure
Scale data infrastructure
Design storage systems
Optimize storage systems
Build indexing systems
Build retrieval systems
Support dataset querying
Support dataset filtering
Support dataset iteration
Develop observability frameworks
Implement workload balancing
Implement throughput optimization
Manage data artifacts
Manage data versioning
Build internal interfaces
Build lightweight tools
Enable data exploration
Support integration of VLMs
Support scalable deployment of VLMs
How You'll Work.
Team & Collaboration
Collaboration with researchers; Collaboration with engineers
Full Job Description
At Rhoda AI, we're building the next generation of generalist intelligent robots. We own the full robotics stack, from high-performance hardware and robot systems to the infrastructure and state-of-the-art foundation world models that control our robots. Our robots are designed to be generalists capable of operating in complex, real-world environments and handling long-tail edge cases, made possible by our cutting-edge research and end-to-end system design.
We've raised over $400M and are investing aggressively in model research, infrastructure, hardware development, and manufacturing scale-up to make generalist robotics a reality.
We're looking for Data Infrastructure MLEs to scale the systems that power our model training data pipeline, from raw ingestion and storage to indexing, retrieval, and throughput optimization at massive scale. We hire across levels, from senior to staff.
What You'll Do
- Architect, build, and scale high-throughput data infrastructure that processes and manages billions of video clips with strong guarantees around reliability, latency, and cost efficiency
- Design and optimize large-scale storage systems (cloud object storage, databases, metadata stores) for multimodal datasets
- Build efficient indexing and retrieval systems to support fast dataset querying, filtering, and iteration for research and production use cases
- Develop observability frameworks for data pipelines, including monitoring, alerting, failure recovery, and performance optimization
- Implement intelligent workload balancing and throughput optimization across distributed compute and storage systems
- Manage data artifacts, versioning, and lineage to ensure reproducibility and traceability across training runs
- Build internal interfaces and lightweight tools that enable researchers and engineers to explore, query, and analyze large datasets at scale
- Support integration and scalable deployment of vision-language models (VLMs) within data pipelines for screening, enric
How to Apply on Ashby
- Ashby is a fast, modern applicant tracking system (ATS); most applications take under 3 minutes.
- The resume parser is strong; verify parsed experience dates and job titles.
- Custom screening questions are often scored algorithmically — answer completely.
- Location field affects geo-based screening; use your actual metro area.