Mindrift

AI

Freelance Agent Evaluation Engineer

Gdańsk, Pomeranian Voivodeship, Poland | Part Time | Remote Friendly

The Brief

“Freelance Agent Evaluation Engineer at Mindrift. Skills: Python, Software Development, Testing, AI Evaluation. Create challenging tasks. Build virtual companies.”

What You'll Achieve.

Evaluate AI coding agents; test, evaluate, and improve AI systems; submit tasks by the deadline; meet the listed acceptance criteria

Industry & Context.

AI

Problems you'll solve

Reason about code; understand deeply where models fail; identify scenarios that reveal differences

What They're Looking For.

Must Have

Degree in Computer Science, Software Engineering, or a related field

5+ years in software development

Python, FastAPI, pytest, async/await, subprocess, file operations

Full-stack development, React-based interfaces, JavaScript/TypeScript, robust back-end systems

Writing tests, functional tests, integration tests

Docker containers, infrastructure tools, Postgres, Kafka, Redis

CI/CD understanding, GitHub Actions

English proficiency - B2

Nice to Have

expert in every item

What You'll Do.

Create challenging tasks

Build virtual companies

Assemble and calibrate tasks

Iterate with AI agent

Review code written by agents

Analyze agent performance

Design adversarial scenarios

Iterate based on feedback

How You'll Work.

Team & Collaboration

Work with product team

Communication Scope

English proficiency - B2
