Anthropic

Technology

Product Manager, Claude Code Model Performance

$175–250k (AI est.) · San Francisco, California, United States · Remote Friendly

The Brief

“Product Manager, Claude Code Model Performance at Anthropic. Skills: Product management, AI models, Model evaluation. Drive model launches end-to-end. Build evals that measure what matters”

What You'll Achieve.

Extract maximum performance from models; ship improvements

Industry & Context.

Technology

Problems you'll solve

Systems thinker

What They're Looking For.

Must Have

2+ years in product management, Bachelor's degree or equivalent experience

Nice to Have

Engineering background

What You'll Do.

Drive model launches end-to-end

Build evals that measure what matters

Translate model improvements into outcomes

Own model launch planning

Define readiness criteria

Coordinate across research and engineering

Ensure launches land cleanly

Implement agentic evals

Measure real-world coding performance

Partner with researchers

Define target behaviors

Influence model development

Understand capability gaps

Turn research progress into improvements

Create clear priorities

How You'll Work.

Team & Collaboration

Partner with researchers; Partner with product engineers; Cross-functional collaboration

Process & Methodology

Roadmap planning

Full Job Description

About Anthropic

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.

About the role

As a Product Manager on Claude Code's model performance team, you will drive model launches end-to-end, build evals that measure what matters, and partner directly with researchers and product engineers to translate model improvements into developer-facing outcomes.

Claude Code is the most capable coding agent in the world but there’s much more we can do to extract the maximum performance from our models. We're looking for a PM who has personally built agentic evals, thinks in systems, uses Claude Code every day, and has refined model taste. You should be as comfortable influencing our research team as you are getting in the weeds of transcripts. You will be the connective tissue between frontier research and the millions of developers who depend on Claude Code to do their best work.

Responsibilities

Own model launch planning and execution for Claude Code: define readiness criteria, coordinate across research and product engineering, and ensure launches land cleanly with developers
Design and implement agentic evals that measure real-world coding performance
Drive the engineering team's eval roadmap
Partner with researchers working on coding capabilities to define target behaviors and influence model development with evidence from real usage
Talk with users and analyze transcripts to understand capability gaps and turn research progress into shipped improvements
Synthesize signal from internal users, external developers, and competitive benchmarks into clear priorities

You might be a good fit if you

Have personally built agentic evals (e.g. SWE-bench-style task suites)
Are a daily Claude Code user and can articulate wh

How to Apply on Greenhouse

Create a Greenhouse profile before applying — it saves time across multiple applications.
Upload your resume as a PDF; the parser handles it better than Word.
Answer all knockout questions carefully — wrong answers auto-reject before a human sees you.
Enable email notifications to track application status in real time.