Amazon.com Services LLC

Research Science, Applied Science, subsidiaries

Applied Scientist

$172–223k

New York, New York, United States

FULL TIME

Market Sentiment

HIGH DEMAND

Neural analysis suggests this role is
optimal for Mid+ candidates.

The Brief

“Applied Scientist at Amazon.com Services LLC. Skills: Robotics, Machine learning, Evaluation systems. Design and implement evaluation frameworks.”

Industry & Context.

Research Science, Applied Science, subsidiaries

Problems you'll solve

Identify performance gaps; Identify failure modes

What They're Looking For.

Must Have

3+ years building models; PhD or Master's degree; 4+ years in CS, CE, or ML; Experience programming in Java, C++, or Python; Experience with algorithms and data structures; Experience with parsing; Experience with numerical optimization; Experience with data mining; Experience with parallel and distributed computing; Experience with high-performance computing

Nice to Have

Experience using Unix/Linux

What You'll Do.

Design evaluation frameworks

Implement evaluation frameworks

Develop task definitions

Develop success criteria

Develop benchmarking methodologies

Create data collection protocols

Refine data collection protocols

Build teleoperation workflows

Build operator interfaces

Analyze evaluation results

Analyze collected data

Identify performance gaps

Identify failure modes

Identify opportunities for data collection

Collaborate with engineering teams

Integrate evaluation tooling

Integrate logging systems

Integrate data pipelines

Stay current with advances

Lead technical projects

Mentor junior scientists

Mentor junior engineers

How You'll Work.

Team & Collaboration

Engineering teams; Robotics stack

Process & Methodology

Technical projects

Full Job Description

We are seeking an Applied Scientist to lead the development of evaluation frameworks and data collection protocols for robotic capabilities. In this role, you will focus on designing how we measure, stress-test, and improve robot behavior across a wide range of real-world tasks. Your work will play a critical role in shaping how policies are validated and how high-quality datasets are generated to accelerate system performance.

You will operate at the intersection of robotics, machine learning, and human-in-the-loop systems, building the infrastructure and methodologies that connect teleoperation, evaluation, and learning. This includes developing evaluation policies, defining task structures, and contributing to operator-facing interfaces that enable scalable and reliable data collection.

The ideal candidate is highly experimental, systems-oriented, and comfortable working across software, robotics, and data pipelines, with a strong focus on turning ambiguous capability goals into measurable and actionable evaluation systems.

Key job responsibilities

- Design and implement evaluation frameworks to measure robot capabilities across structured tasks, edge cases, and real-world scenarios
- Develop task definitions, success criteria, and benchmarking methodologies that enable consistent and reproducible evaluation of policies
- Create and refine data collection protocols that generate high-quality, task-relevant datasets aligned with model development needs
- Build and iterate on teleoperation workflows and operator interfaces to support efficient, reliable, and scalable data collection
- Analyze evaluation results and collected data to identify performance gaps, failure modes, and opportunities for targeted data collection
- Collaborate with engineering teams to integrate evaluation tooling, logging systems, and data pipelines into the broader robotics stack
- Stay current with advances in robotics, evaluation methodologies, and human-in-the-loop learning to continuou
