Mindrift

Freelance Agent Evaluation Engineer

Córdoba, Córdoba Province, Argentina. Part-time, remote-friendly.

The Brief

“Freelance Agent Evaluation Engineer at Mindrift. Skills: Python, software development, test automation, AI evaluation. Create challenging tasks and define evaluation criteria.”

What You'll Achieve.

Improve AI systems; Evaluate AI coding agents; Verify tests catch real problems; Ensure tests don't miss bad solutions; Ensure tests don't break on good solutions

Industry & Context.

Problems you'll solve

Analyze why an agent failed or succeeded; Deeply understand where models fail

What They're Looking For.

Must Have

5+ years in software development, Python, FastAPI, pytest, async/await, subprocess, file operations, full-stack development, React-based interfaces, JavaScript/TypeScript, robust back-end systems, writing tests, Docker containers, CI/CD understanding, English proficiency (B2)

Nice to Have

infrastructure tools, Postgres, Kafka, Redis, GitHub Actions

What You'll Do.

Create challenging tasks

Define evaluation criteria

Build virtual companies

Assemble and calibrate tasks

Ensure task solvability

Design tasks in isolated environments

Iterate with AI agent on tests

Review code written by agents

Analyze agent failures/successes

Design adversarial scenarios

Iterate based on feedback

How You'll Work.

Team & Collaboration

Work with expert QA reviewers

Communication Scope

English proficiency

Skill Signal 26 detected

Core

Software development ×5

Test automation ×5

Docker ×5

Python ×4

Required

Full-stack development ×3

Backend systems development ×3

Frontend development ×3

Writing tests ×3

CI/CD ×3

Infrastructure tools ×3

AI evaluation ×2

FastAPI ×2

pytest ×2

Postgres ×2

Nice to have

async/await

subprocess

file operations

React

JavaScript

TypeScript

AI systems evaluation

AI agent evaluation

AI coding agents

Role Details

Seniority

mid

Work Mode

Remote

Type

PART TIME

Experience

5+ yrs

Education

Degree in Computer Science, Software Eng

AI-Extracted Insights

Domain Areas

ai-coding-agents

real-world-developer-tasks

simulated-environments

development-history

developer-workstation-emulation

web-application-codebase
