Handshake

AI Red Teamer (LLM Generalist)

Seattle, Washington, United States FULL TIME

Market Sentiment

HIGH DEMAND

Neural analysis suggests this role is optimal for Mid+ candidates.

The Brief

“AI Red Teamer (LLM Generalist) at Handshake. Skills: LLM Red Teaming, Adversarial Prompting, AI Safety Testing. Stress-test large language models and design creative prompts.”

What You'll Achieve.

Strengthen defenses

Industry & Context.

Problems you'll solve

adversarial problem-solving skills; rigorous documentation; unusual interests; fandoms; niche internet cultures; gaming exploits; Wikipedia rabbit holes

Eligibility Requirements

Work with potentially disturbing content on a regular basis

What They're Looking For.

Must Have

Hands-on experience using multiple LLMs; ethical judgment; ability to separate adversarial thinking from personal values

Nice to Have

Familiarity with jailbreak or evasion techniques; familiarity with Python or other scripting languages; experience working with LLM APIs or evaluation tooling; comfort with structured data annotation and rubric-based scoring; prior work in trust and safety, content moderation, QA, or security research; subject matter expertise in any high-risk domain

What You'll Do.

Stress-test large language models

Design creative prompts

Expose vulnerabilities

Probe models across risk categories

Craft creative prompts

Stress-test AI guardrails

Discover ways around safety filters

Provoke disallowed outputs

Evaluate and score model responses

Document experiments clearly

Review and refine adversarial prompts

Contribute to harm taxonomy development

Contribute to calibration exercises

Contribute to inter-rater reliability work

Work with potentially disturbing content

Stay current on jailbreaks

How You'll Work.

Team & Collaboration

Collaborate with engineers, data scientists, and researchers; share findings; strengthen defenses

Communication Scope

Clear and thoughtful written communication

Full Job Description

ABOUT HANDSHAKE

Handshake was founded on a simple belief that everyone deserves a path to a great career, regardless of where they went to school or who they know. Today, we power 25 million job seekers, 1 million+ employers, and 1,600 educational institutions.

In 2025, we started Handshake AI and built the fastest-growing AI data business in history. We work directly with frontier AI lab researchers to create evaluations, publish benchmarks, and push the boundary of data. We’ve grown from $0 to ~$1B run rate and pay ~$60M to over 30K individuals every month.

Why join Handshake now:

- Shape how every career evolves in the AI economy, at global scale, with impact your friends, family and peers can see and feel
- Partner hand-in-hand with world-class AI labs, Fortune 500 partners and the world’s top educational institutions
- Work together with engineers, scientists, operators, and more from Palantir, Meta, Scale AI, and former YC founders
- Build a massive, fast-growing business with billions in revenue

About Handshake AI

Human data is the core infrastructure to AI advancement. Frontier AI labs currently improve model capabilities with various data-intensive post-training techniques. We believe that data spend for AI training will increase by 3-5x in the next few years and continue for much longer as models take on new domains. Handshake AI supports all of the frontier AI labs, working on their most complex data at the largest scale.

About the Role

As an AI Red Teamer, you will stress-test large language models by intentionally trying to break them. Rather than checking whether an answer is correct, you will design creative, adversarial prompts that expose vulnerabilities: unsafe content, bias, broken guardrails, hallucinations, prompt injection weaknesses, and unexpected behaviors. Your work directly supports AI safety and model robustness for leading research labs.

This is a generalist red teaming role. You will probe models across the full spectrum of risk categor

SKILL SIGNAL 100 detected · ranked by frequency

self-harm ×6

ChatGPT ×4

Claude ×4

Gemini ×4

open-source models ×4

stress-test large language models ×3

design creative prompts ×3

adversarial prompts ×3

expose vulnerabilities ×3

unsafe content ×3

bias ×3

broken guardrails ×3

hallucinations ×3

prompt injection weaknesses ×3

unexpected behaviors ×3

content safety ×3

CBRN ×3

cybersecurity ×3

persuasion and influence operations ×3

child safety ×3

over-companionship ×3

regulatory compliance ×3

text model capabilities ×3

image model capabilities ×3

voice model capabilities ×3

agentic model capabilities ×3

Craft creative prompts ×3

multi-turn scenarios ×3

stress-test AI guardrails ×3

risk categories ×3

Discover ways around safety filters ×3

restrictions ×3

BEHAVIOURAL

creativity; curiosity; ethical judgment; self-directed; collaborative; persistence; comfort with frequent failure

Role Details

Type FULL TIME

Category general-&-administrative

Salary Band <30k

AI-Extracted Insights

Domain Areas

ai-safety; model-robustness; content-safety; cbrn; cybersecurity; persuasion-and-influence-operations; child-safety; self-harm

How to Apply on Ashby

Ashby is a fast modern ATS — most applications take under 3 minutes.
The resume parser is strong; verify parsed experience dates and job titles.
Custom screening questions are often scored algorithmically — answer completely.
Location field affects geo-based screening; use your actual metro area.
