Mindrift

Freelance Agent Evaluation Engineer

Glasgow, United Kingdom · Part Time · Remote Friendly

The Brief

“Freelance Agent Evaluation Engineer at Mindrift. Skills: Python, software development, test automation, AI evaluation. Create challenging tasks and define evaluation criteria.”

What You'll Achieve.

Deliver tasks that meet acceptance criteria; evaluate AI coding agents; improve AI systems

Industry & Context.

Problems you'll solve

Reasoning about code; analyzing why an agent failed; designing edge cases; designing adversarial scenarios

What They're Looking For.

Must Have

5+ years in software development, Python, FastAPI, pytest, async/await, subprocess, file operations, full-stack development, React-based interfaces, JavaScript/TypeScript, robust back-end systems, writing tests, Docker containers, Postgres, Kafka, Redis, CI/CD understanding, GitHub Actions, English proficiency - B2

What You'll Do.

Create challenging tasks

Define evaluation criteria

Build virtual companies

Assemble and calibrate tasks

Write tests for solutions

Iterate with an AI agent on tests

Review code written by agents

Analyze agent failures and successes

Design adversarial scenarios

Iterate based on feedback

How You'll Work.

Team & Collaboration

Work with expert QA reviewers

Communication Scope

English proficiency - B2
