Company
SaaS
Staff Engineer — Agentic AI
“Staff Engineer — Agentic AI. Skills: Agentic AI, LLM application architecture, Technical leadership. Lead development of core agent intelligence layer. Execute multi-step workflows”
What You'll Achieve.
Drive agent task success rate; Ensure commercial viability
Industry & Context.
Error recovery; Failure handling
What They're Looking For.
Must Have
7+ years of software engineering, 2+ years building agentic LLM systems, Deep LLM application architecture experience, Agentic systems evaluation/benchmarking experience, Shipped AI systems with measurable outcomes, Proficiency in Python, Experience leading a technical team
Nice to Have
Experience with desktop automation, Background in mechanical engineering, Familiarity with enterprise deployment constraints, Published work in agentic AI, Experience building public benchmarks
What You'll Do.
Lead development of core agent intelligence layer
Execute multi-step workflows
Own full product loop
Define agent capabilities
Build agent implementations
Benchmark against real workflows
Drive agent task success rate
Define evaluation framework
Improve completion metrics
Set per-task token budgets and track cost per completed workflow
Ensure commercial viability
Build reproducible evaluation infrastructure
Validate user stories
Translate user stories into evals
Close loop between research and benchmarking
Own agent architecture decisions
Set technical direction
Review architecture decisions
Raise engineering bar
Collaborate cross-functionally
Align agent behavior with usage
How You'll Work.
Team & Collaboration
Cross-functionally with integrations; With product; With customers
Process & Methodology
Roadmap planning
Full Job Description
ABOUT THE ROLE
A well-funded, early-stage B2B SaaS company building AI agent infrastructure for mechanical engineering workflows is hiring a Staff Engineer — Agentic AI to own the core agent intelligence layer. This is a high-impact, senior technical leadership role reporting directly to the CTO. You'll sit at the intersection of applied agentic AI, user research, and product delivery — determining real-world value for Fortune 100 enterprise customers in the CAD, CAE, and PLM space. You'll lead a small team of AI engineers, a user researcher, and domain expert contractors, acting as a player-coach who writes production code and sets technical direction.

WHAT YOU'LL DO
- Lead development of the core agent intelligence layer that executes multi-step workflows across complex desktop engineering software.
- Own the full product loop: define agent capabilities from user stories, build implementations, and benchmark against real workflows.
- Drive agent task success rate — define the evaluation framework, establish baselines, and systematically improve completion metrics.
- Set and enforce per-task token budgets; track cost per completed workflow to ensure commercial viability.
- Build rigorous, reproducible evaluation infrastructure grounded in validated user stories (SWE-bench-level rigor applied to engineering workflows).
- Lead user story mapping and validation through interviews and close collaboration with domain experts.
- Translate validated user stories into testable evals and close the loop between research and benchmarking.
- Own agent architecture decisions: tool-calling strategies, state management, error recovery, model routing, and context management.
- Set technical direction, review architecture decisions, unblock the team, and raise the engineering bar across a team of 3–6 engineers.
- Collaborate cross-functionally with integrations, product, and customers during POCs to align agent behavior with real-world usage.

WHAT WE'RE LOOKING FOR
Must-haves:
- 7+