Humanoid
Technology
Internship - Perception and Spatial AI
What You'll Achieve.
Contribute to real systems
Industry & Context.
Problem-solving skills
What They're Looking For.
Must Have
Pursuing a degree in Computer Science, Machine Learning, Robotics, or related field
Foundations in machine learning
Foundations in computer vision
Hands-on experience with PyTorch
Training ML models
Experience running experiments
Interpreting results
Nice to Have
Interest in multimodal models
Interest in 3D vision
Interest in spatial reasoning
Interest in navigation
Interest in embodied AI
What You'll Do.
Develop perception systems
Work on Vision-Language(-Action) models
Work on multimodal models
Explore scene understanding
Explore 3D perception
Explore navigation methods
Apply methods to real systems
Integrate models into robotic platforms
Generate a 3D representation
Produce a geometrically coherent and consistent reconstruction
Assign semantic labels in 3D
Align semantic predictions with geometry
How You'll Work.
Team & Collaboration
Collaborating with the team; Cross-functional teams
Full Job Description
Here at Humanoid, we believe in a future where robots amplify human potential. That’s why we’ve set out on a mission to build the world’s most capable, commercially scalable, and safe humanoid robots. We’re bringing that mission to life with HMND‑01 Alpha - our rapidly developed humanoid platform now running in real industrial pilots - and we’re growing the team to take it even further.

OUR MISSION
We’re building software systems that enable robots to operate effectively in the real world, expanding human capability and redefining how work gets done.

THE OPPORTUNITY
We’re looking for interns who are curious, proactive, and excited to work on real-world robotic systems. This is an open-ended internship: you won’t be confined to a single component, but will work across perception, navigation, and multimodal systems, collaborating closely with the team to find where you can have the most impact.

You may work anywhere along the stack, from camera systems (timestamping, synchronization, validation), through perception and scene understanding, to navigation and integration with locomotion. The scope is intentionally broad. We’re looking for people who are excited to dive into unfamiliar areas and learn quickly.

This is a full-time internship (5 days per week) over the summer (mid-June to mid-September), based in our London Paddington office, where you’ll contribute to real systems from early on with guidance and support from experienced researchers and engineers.

Duration: 12 weeks | Start date: June | Compensation: Competitive pay + we'll keep you fed (seriously, the food is good)

WHAT YOU MIGHT WORK ON
- Develop perception systems for robot navigation and interaction in real-world environments
- Work on focused problems within Vision-Language(-Action) or multimodal models (components, datasets, evaluation)
- Run and analyse experiments using existing pipelines
- Improve data quality through curation and labeling
- Explore scene understanding, 3D perception, or navigation