Mindrift
Tech / AI / Software
Freelance Agent Evaluation Engineer
“Freelance Agent Evaluation Engineer at Mindrift. Skills: Software development, Test automation, AI agent evaluation, Python development, Full-stack development, Writing tests. Create challenging tasks for AI coding agents. Define evaluation criteria for AI coding agents”
Industry & Context.
Reasoning about code across the stack; Understanding where models fail; Designing tasks that challenge frontier models
What They're Looking For.
Must Have
Degree in Computer Science, Software Engineering, or related fields
5+ years in software development
Primarily Python (FastAPI, pytest, async/await, subprocess, file operations)
Background in full-stack development
Experience building React-based interfaces (JavaScript/TypeScript)
Experience building robust back-end systems
Experience writing tests (functional, integration)
Docker containers
Familiarity with infrastructure tools (Postgres, Kafka, Redis)
CI/CD understanding (GitHub Actions as a user: triggers, labels, reading results)
English proficiency - B2
Nice to Have
Expertise in every item is not required, but comfort reading and reasoning about code across the stack is expected.
What You'll Do.
Create challenging tasks for AI coding agents
Define evaluation criteria for AI coding agents
Build virtual companies with their own codebases
Assemble and calibrate tasks from intermediate states of virtual companies
Craft prompts for tasks
Design tasks set in isolated environments (emulations of a developer's workstation)
Write tests that accept all correct solutions and reject incorrect ones
Iterate with an AI agent on tests
Review code written by agents
Analyze why an agent failed or succeeded
Design edge cases and adversarial scenarios
Iterate based on feedback from expert QA reviewers
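As a rough illustration of "tests that accept all correct solutions and reject incorrect ones", here is a minimal Python sketch. The task (deduplicate a list while keeping first-occurrence order), the function names, and the checker are all hypothetical and are not taken from Mindrift's materials. The checker asserts only on observable behavior, so two differently written correct solutions both pass while a plausible but wrong one fails.

```python
from typing import Callable, List


def dedupe_loop(items: List[int]) -> List[int]:
    """Correct solution A (hypothetical): explicit loop with a seen-set."""
    seen = set()
    out = []
    for x in items:
        if x not in seen:
            seen.add(x)
            out.append(x)
    return out


def dedupe_dict(items: List[int]) -> List[int]:
    """Correct solution B (hypothetical): dicts preserve insertion order."""
    return list(dict.fromkeys(items))


def dedupe_wrong(items: List[int]) -> List[int]:
    """Incorrect solution: removes duplicates but loses the original order."""
    return sorted(set(items))


def passes_checks(solution: Callable[[List[int]], List[int]]) -> bool:
    """Behavioral check: compares outputs only, never the implementation."""
    cases = [
        ([], []),                      # edge case: empty input
        ([1, 1, 1], [1]),              # edge case: all duplicates
        ([3, 1, 3, 2, 1], [3, 1, 2]),  # order must follow first occurrence
    ]
    try:
        return all(solution(list(inp)) == expected for inp, expected in cases)
    except Exception:
        # A solution that crashes is treated as a failed solution.
        return False
```

The third case is the one that separates the solutions: an unordered input exposes the order-losing implementation, which would otherwise pass the two edge cases.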
How You'll Work.
Team & Collaboration
Iterate based on feedback from expert QA reviewers
Communication Scope
English proficiency - B2