AGI, INC.
AI
Research Engineer - Evals
“Research Engineer - Evals at AGI, INC. Skills: eval harness, agent eval, on-device performance. Build the eval harness and eval suites.”
What You'll Achieve.
Ship eval harness; Ship eval suites; Ship dashboards and tooling; Ship against eval bar; Catch a regression; Clear a launch; Shape research roadmap
Industry & Context.
Decide what better means; Make decisions on real signal
SF, in person
What They're Looking For.
Must Have
Experience with Python; Experience with machine learning frameworks; Knowledge of PostgreSQL; Proficient in AWS; Proficient in EC2; Proficient in S3; Proficient in Lambda; Proficient in Terraform; Proficient in Docker; Experience with Java; Experience with Spring Boot; Familiarity with Kafka; Familiarity with Redis; Experience with agent eval; Experience with tool use; Experience with long-horizon tasks; Experience with multilingual behavior; Experience with on-device perf trade-offs; Experience with QA at OEM scale; Experience with shipping consumer agents
Nice to Have
Kubernetes
What You'll Do.
Build dashboards and tooling
Set the bar for shipped
Protect bar from deadlines
Measure non-deterministic systems
How You'll Work.
Team & Collaboration
Work with product engineers; Work with partnerships
How to Apply on Ashby
- Ashby is a fast, modern ATS; most applications take under 3 minutes.
- The resume parser is strong; verify parsed experience dates and job titles.
- Custom screening questions are often scored algorithmically — answer completely.
- Location field affects geo-based screening; use your actual metro area.