White Circle

Technology

Data Labeler

$30–50k · Remote · FULL TIME · Remote Friendly

Market Sentiment

HIGH DEMAND

Neural analysis suggests this role is optimal for Mid+ candidates.

The Brief

“Data Labeler at White Circle. Skills: data labeling, AI evaluation, model assessment. Review and evaluate AI conversations.”

Industry & Context.

Technology

Problems you'll solve

Analyze nuanced situations

Eligibility Requirements

Comfortable working with sensitive content

What They're Looking For.

Must Have

Exceptional attention to detail, Consistent decisions across large volumes, Follow guidelines, Written English skills, Communicate clearly, Explain reasoning well

Nice to Have

Experience with content moderation, Experience with trust & safety, Experience with quality assurance, Experience with compliance, Experience with policy enforcement, Experience in data annotation, Experience in AI evaluation, Experience in RLHF, Experience in model assessment, Worked with AI tools extensively, Understand AI strengths and limitations, Enjoy finding edge cases, Enjoy finding unusual model behavior

What You'll Do.

Review AI conversations

Evaluate AI conversations

Evaluate model outputs

Assess responses for safety

Assess responses for quality

Assess responses for accuracy

Assess responses for policy compliance

Assess responses for user intent

Identify harmful behavior

Identify unsafe behavior

Identify misleading behavior

Identify low-quality behavior

Categorize model outputs

Moderate sensitive content

Identify policy violations

Compare model responses

Score model responses

Investigate edge cases

Investigate ambiguous situations

Provide structured feedback

Improve evaluation guidelines

Improve annotation processes

Contribute to datasets

How You'll Work.

Communication Scope

Explain reasoning

Full Job Description

ABOUT US

White Circle (https://whitecircle.ai/) is an AI Safety company building the safety, reliability, and optimization layer for AI systems. At the core of our platform are policies – simple natural-language rules that define what an AI model should and shouldn’t do. We automatically test, enforce, and continuously improve these policies at scale.

- We’ve raised $11M from top funds, founders, and senior leaders at OpenAI, Anthropic, HuggingFace, Mistral, DeepMind, Datadog, Sentry, and others
- We process over one hundred million API calls every month
- We fine-tune and train our own LLMs so they run faster and cheaper than any open or proprietary model

We’re a small, highly focused team. If you want to work deeply on hard problems, see your work ship to production quickly, and influence how AI safety is actually built – you’re the one we need.

IN THIS ROLE, YOU WILL

- Review and evaluate AI conversations and model outputs
- Assess responses for safety, quality, accuracy, policy compliance, and user intent
- Identify harmful, unsafe, misleading, or low-quality behavior
- Label and categorize model outputs according to internal evaluation frameworks
- Moderate sensitive content and identify policy violations
- Compare, rank, and score model responses
- Investigate edge cases and ambiguous situations
- Provide structured feedback to researchers and engineers
- Help improve evaluation guidelines and annotation processes
- Contribute to the datasets used to train and evaluate AI systems

WE'RE LOOKING FOR SOMEONE WHO

- Has exceptional attention to detail
- Can make consistent decisions across large volumes of data
- Enjoys analysing nuanced situations where there isn't always a clear answer
- Can follow guidelines while exercising good judgment
- Has strong written English skills
- Communicates clearly and explains reasoning well
- Is curious about AI and how these systems work

YOU MIGHT BE A GREAT FIT IF YOU

- Have experience with content moderation, trust & safety, qu

SKILL SIGNAL 19 detected · ranked by frequency

Data labeling ×5

AI evaluation ×5

Model assessment ×5

Model output evaluation ×3

Content moderation ×3

RLHF ×3

LLMs

Policy compliance

User intent assessment

Harmful behavior identification

Low-quality behavior identification

Policy violation identification

Edge case investigation

Ambiguous situation investigation

Feedback provision

Evaluation guideline improvement

Annotation process improvement

Dataset contribution

BEHAVIOURAL

Curious about AI

Role Details

Work Mode Remote

Type FULL TIME

Category research

Salary Band 30k-50k

AI-Extracted Insights

Domain Areas

ai-safety, ai-systems, llms

How to Apply on Ashby

Ashby is a fast modern ATS — most applications take under 3 minutes.
The resume parser is strong; verify parsed experience dates and job titles.
Custom screening questions are often scored algorithmically — answer completely.
Location field affects geo-based screening; use your actual metro area.
