Nuro

Technical Lead, Evaluation Infrastructure

$194–291k · Mountain View, California, United States

The Brief

“Technical Lead, Evaluation Infrastructure at Nuro. Skills: Evaluation Infrastructure, AI, ML, Python. Mentor and grow the team. Champion AI-native engineering practices.”

What You'll Achieve.

enable L4 driverless deployment; shorten time-to-signal; shorten time-to-confidence; meet high SLAs

What They're Looking For.

Must Have

Python, C++, AI-native mindset, Claude Code, Cursor

Nice to Have

data engineering, batch and streaming data processing, warehousing, analytics solutions, data workflow orchestration platforms, evaluation platforms, validation platforms, analytics platforms, autonomy, robotics, safety-critical systems

What You'll Do.

Champion AI-native engineering practices

Build metrics framework

Build evaluation pipelines

Build introspection tooling

Build analysis products

Empower Autonomy and Systems

How You'll Work.

Team & Collaboration

Partner with Product, Autonomy, and Systems, and with PMs, engineers, and cross-functional stakeholders

Communication Scope

clear, concise communicator

Full Job Description

Who We Are

Nuro is a self-driving technology company on a mission to make autonomy accessible to all. Founded in 2016, Nuro is building the world’s most scalable driver, combining cutting-edge AI with automotive-grade hardware. Nuro licenses its core technology, the Nuro Driver™, to support a wide range of applications, from robotaxis and commercial fleets to personally owned vehicles. With technology proven over years of self-driving deployments, Nuro gives automakers and mobility platforms a clear path to AVs at commercial scale, empowering a safer, richer, and more connected future.

About the Role

Evaluation Infrastructure plays a critical role at Nuro, directly enabling L4 driverless deployment. The team supports two demanding workloads: day-to-day Autonomy Evaluation that powers rapid software iteration, and large-scale Driverless Safety Validation that produces the rigorous evidence required to deploy autonomy on public roads.

The Evaluation Infrastructure team builds the metrics framework, evaluation pipelines, introspection tooling, and analysis products that turn raw on-road and simulation logs into actionable insight. Our metrics stack spans both heuristic and ML-based approaches, covering everything from low-level component accuracy to end-to-end behavior quality.

The platform empowers autonomy and Systems

Invest in the scale, reliability, and CI/CD of the evaluation stack to shorten time-to-signal for evaluation and time-to-confidence for validation, and to meet high SLAs for downstream stakeholders

Mentor and grow the Evaluation Infrastructure team, and champion AI-native engineering practices that compound team velocity and code quality

Partner with Product, Autonomy, Systems

A clear, concise communicator who partners effectively with PMs, engineers, and cross-functional stakeholders across Autonomy, Systems

Sets the technical bar for metric quality, pipeline rigor, and safety-critical engineering practice across the broader software organization
