Company

Engineering

Staff Engineer — Agentic AI

$220–330k (AI estimate) · San Francisco, California, United States · FULL TIME

Market Sentiment

HIGH DEMAND

Neural analysis suggests this role is optimal for Senior candidates.

The Brief

“Staff Engineer — Agentic AI. Skills: Agentic AI, LLM application architectures, Evaluation frameworks. Lead development of core agent intelligence. Execute multi-step workflows”

What You'll Achieve.

Ensure commercial viability; Improve completion metrics

Industry & Context.

Engineering

Problems you'll solve

Error recovery

What They're Looking For.

Must Have

7+ years in software engineering, At least 2 years building agentic LLM-based agents that act in the real world, Python skills, Experience with desktop automation

Nice to Have

Domain experience in mechanical engineering, Experience in CAD/CAE, Experience in PLM, Experience in adjacent industries, Understanding of enterprise deployment constraints, Track record contributing to public benchmarks, Track record contributing to publications, Track record contributing to open-source agentic AI projects

What You'll Do.

Lead development of core agent intelligence

Execute multi-step workflows

Own full product loop

Define agent capabilities

Build agent implementations

Benchmark agent implementations

Drive agent task success rate

Define evaluation frameworks

Iterate to improve metrics

Set per-task token budgets

Track cost per workflow

Build reproducible evaluation infrastructure

Lead user story mapping

Validate user stories

Collaborate with domain experts

Translate user stories

Create testable evals

Close loop between research and benchmarking

Own agent architecture decisions

Write production code

Raise engineering standards

Collaborate cross-functionally

How You'll Work.

Team & Collaboration

Cross-functionally with integrations; Cross-functionally with product; Cross-functionally with customers

Full Job Description

ABOUT THE ROLE

We're hiring a senior technical leader to own the core agent intelligence that turns engineers' intent into reliable, cost-efficient multi-step workflows across desktop engineering tools. This role sits at the intersection of applied agentic AI, user research, and product delivery and will determine the product's real-world value to enterprise customers.

You'll report to the CTO and serve as technical lead for a small team of AI engineers, a user researcher, and domain expert contractors in an early-stage, high-impact environment (Series A, Fortune 100 customers, direct line to leadership).

WHAT YOU'LL DO

- Lead development of the core agent intelligence layer that executes multi-step workflows across complex desktop engineering software.
- Own the full product loop: define agent capabilities from user stories, build implementations, and benchmark against real workflows.
- Drive agent task success rate by defining evaluation frameworks, establishing baselines, and iterating to improve completion metrics.
- Set and enforce per-task token budgets and track cost per completed workflow to ensure commercial viability.
- Build rigorous, reproducible evaluation infrastructure grounded in validated user stories.
- Lead user story mapping and validation through interviews and close collaboration with domain experts.
- Translate validated user stories into testable evals and close the loop between research and benchmarking.
- Own agent architecture decisions including tool-calling, state management, error recovery, model routing, and context management.
- Act as a player-coach: write production code, review designs, unblock the team, and raise engineering standards.
- Collaborate cross-functionally with integrations, product, and customers during POCs to align agent behavior with real-world usage.

WHAT WE'RE LOOKING FOR

- 7+ years in software engineering, including at least 2 years building agentic LLM-based agents that act in the real world.
- Deep experience

SKILL SIGNAL 21 detected · ranked by frequency

Evaluation frameworks ×8

Agentic AI ×5

LLM application architectures ×5

Benchmarking frameworks ×3

Cost efficiency ×3

Failure modes ×3

Function calling ×3

Tool APIs ×3

Observability ×3

Tracing ×3

Desktop automation ×3

Programmatic control ×3

LLM

Python

COM

User research

Product delivery

Technical leadership

System design

Orchestration patterns

LLM tooling

BEHAVIOURAL

Leadership

Role Details

Experience: 7–10 yrs

Level: Senior

Type: FULL TIME

Category: Engineering

Salary Band: 200k+

AI-Extracted Insights

Domain Areas

agentic-ai, llm-application-architectures, desktop-automation, mechanical-engineering, cad-cae, plm, enterprise-deployment-constraints

How to Apply on Ashby

Ashby is a fast modern ATS — most applications take under 3 minutes.
The resume parser is strong; verify parsed experience dates and job titles.
Custom screening questions are often scored algorithmically — answer completely.
Location field affects geo-based screening; use your actual metro area.
