Company
AI startup
Staff Engineer - Agentic AI
“Staff Engineer - Agentic AI. Skills: Agentic AI, LLM application architectures, Evaluation and benchmarking. Lead development of core agent intelligence layer. Execute multi-step workflows”
What You'll Achieve.
Drive agent task success rate; Ensure commercial viability; Improve completion metrics
Industry & Context.
Failure handling; Error recovery; Troubleshooting
What They're Looking For.
Must Have
7+ years of software engineering, 2+ years building LLM agents, Deep experience with LLM application architectures, Experience evaluating and benchmarking agentic systems, Shipped AI systems with measurable outcomes, Python skills, Hands-on LLM tooling experience, Experience leading a small technical team
Nice to Have
Desktop automation experience, COM experience, Programmatic control of applications experience, Mechanical engineering background, CAD/CAE background, PLM background, Enterprise deployment constraints familiarity, Published work in agentic AI, Open-source contributions in agentic AI, Experience building public benchmarks for AI agents
What You'll Do.
Lead development of core agent intelligence layer
Execute multi-step workflows
Serve as technical lead
Own full product loop
Define agent capabilities
Build agent implementations
Benchmark agent implementations
Drive agent task success rate
Define eval framework
Improve completion metrics
Set and enforce per-task token budgets
Ensure commercial viability
Build evaluation infrastructure
Ground evals in user stories
Lead user story mapping
Validate user stories
Conduct direct interviews
Collaborate with domain experts
Translate user stories into evals
Close loop between research and benchmarking
Own agent architecture decisions
Write production code
Raise engineering standards
Collaborate cross-functionally
Align agent behavior with usage
How You'll Work.
Team & Collaboration
Small senior team; Cross-functional work with integrations, product, and customers
Process & Methodology
Setting direction, Driving architecture decisions
Full Job Description
ABOUT THE ROLE
A well-funded, early-stage AI startup in the mechanical engineering software space is looking for a Staff Engineer - Agentic AI to own the core agent intelligence layer that turns engineers' intent into reliable, cost-efficient multi-step workflows across complex desktop engineering tools. This is a high-impact, senior technical leadership role reporting directly to the CTO, sitting at the intersection of applied agentic AI, user research, and product delivery. The company serves Fortune 100 hardware engineering customers and is backed by notable investors. You'll join a small, senior team and have a direct line to executive leadership. The role is on-site in San Francisco, CA.

WHAT YOU'LL DO
- Lead development of the core agent intelligence layer executing multi-step workflows across complex desktop engineering software (CAD, CAE, PLM).
- Report to the CTO and serve as technical lead for a small team of AI engineers, a user researcher, and domain expert contractors.
- Own the full product loop: define agent capabilities from user stories, build implementations, and benchmark against real workflows.
- Drive agent task success rate: define the eval framework, establish baselines, and systematically improve completion metrics.
- Set and enforce per-task token budgets; track cost per completed workflow to ensure commercial viability.
- Build rigorous, reproducible evaluation infrastructure grounded in validated user stories (SWE-bench-level rigor applied to engineering workflows).
- Lead user story mapping and validation through direct interviews and collaboration with domain experts.
- Translate validated user stories into testable evals, closing the loop between research and benchmarking.
- Own agent architecture decisions: tool-calling strategies, state management, error recovery, model routing, and context management.
- Act as a player-coach: write production code, review designs, unblock the team, and raise engineering standards.
- Collaborate cros
How to Apply on Ashby
- Ashby is a fast, modern applicant tracking system (ATS); most applications take under 3 minutes.
- The resume parser is strong; verify parsed experience dates and job titles.
- Custom screening questions are often scored algorithmically — answer completely.
- Location field affects geo-based screening; use your actual metro area.