Mindrift

Freelance Agent Evaluation Engineer

Incheon, Incheon, South Korea · Part Time · Remote Friendly
The Brief

“Freelance Agent Evaluation Engineer at Mindrift. Skills: Python, Software Development, Test Automation, AI Evaluation. Create challenging tasks. Build virtual companies”

What You'll Achieve.

Evaluate AI coding agents by creating tasks that challenge frontier models. Because these tasks have many valid solutions, you will write tests that accept all correct solutions and reject incorrect ones.

Industry & Context.

Problems you'll solve

Reasoning about code

What They're Looking For.

Must Have

Degree in Computer Science, Software Engineering, or related fields

5+ years in software development

Python, FastAPI, pytest, async/await, subprocess, file operations

Full-stack development, React-based interfaces, JavaScript/TypeScript, robust back-end systems

Writing tests, functional tests, integration tests

Docker containers, infrastructure tools, Postgres, Kafka, Redis

CI/CD understanding, GitHub Actions

English proficiency - B2

Nice to Have

expert in every item

What You'll Do.

Create challenging tasks

Build virtual companies

Assemble and calibrate tasks

Iterate with AI agent

Review code written by agents

Analyze agent failures

Design adversarial scenarios

Iterate based on feedback

How You'll Work.

Team & Collaboration

Work with product team

Communication Scope

English proficiency - B2
