Handshake
AI
AI Red Teamer (LLM Generalist)
Neural analysis suggests this role is
optimal for Mid+ candidates.
“AI Red Teamer (LLM Generalist) at Handshake. Skills: LLM Red Teaming, Adversarial Prompting, AI Safety Testing. Stress-test large language models and design creative prompts.”
What You'll Achieve.
Strengthen defenses
Industry & Context.
adversarial problem-solving skills; rigorous documentation; unusual interests; fandoms; niche internet cultures; gaming exploits; Wikipedia rabbit holes
Work with potentially disturbing content on a regular basis
What They're Looking For.
Must Have
Hands-on experience using multiple LLMs; ethical judgment; ability to separate adversarial thinking from personal values
Nice to Have
Familiarity with jailbreak or evasion techniques; familiarity with Python or other scripting languages; experience working with LLM APIs or evaluation tooling; comfort with structured data annotation and rubric-based scoring; prior work in trust and safety, content moderation, QA, or security research; subject matter expertise in any high-risk domain
What You'll Do.
Stress-test large language models
Design creative prompts
Expose vulnerabilities
Probe models across risk categories
Craft creative prompts
stress-test AI guardrails
Discover ways around safety filters
provoke disallowed outputs
Evaluate and score model responses
Document experiments clearly
Review and refine adversarial prompts
Contribute to harm taxonomy development
Contribute to calibration exercises
Contribute to inter-rater reliability work
Work with potentially disturbing content
Stay current on jailbreaks
How You'll Work.
Team & Collaboration
Collaborate with engineers, data scientists, and researchers; share findings; strengthen defenses
Communication Scope
Clear and thoughtful written communication
Full Job Description
ABOUT HANDSHAKE
Handshake was founded on a simple belief that everyone deserves a path to a great career, regardless of where they went to school or who they know. Today, we power 25 million job seekers, 1 million+ employers, and 1,600 educational institutions.
In 2025, we started Handshake AI and built the fastest-growing AI data business in history. We work directly with frontier AI lab researchers to create evaluations, publish benchmarks, and push the boundary of data. We’ve grown from $0 to ~$1B run rate and pay ~$60M to over 30K individuals every month.
Why join Handshake now:
- Shape how every career evolves in the AI economy, at global scale, with impact your friends, family and peers can see and feel
- Partner hand-in-hand with world-class AI labs, Fortune 500 partners and the world’s top educational institutions
- Work together with engineers, scientists, operators, and more from Palantir, Meta, Scale AI, and former YC founders
- Build a massive, fast-growing business with billions in revenue
About Handshake AI
Human data is the core infrastructure to AI advancement. Frontier AI labs currently improve model capabilities with various data-intensive post-training techniques. We believe that data spend for AI training will increase by 3-5x in the next few years and continue for much longer as models take on new domains. Handshake AI supports all of the frontier AI labs, working on their most complex data at the largest scale.
About the Role
As an AI Red Teamer, you will stress-test large language models by intentionally trying to break them. Rather than checking whether an answer is correct, you will design creative, adversarial prompts that expose vulnerabilities: unsafe content, bias, broken guardrails, hallucinations, prompt injection weaknesses, and unexpected behaviors. Your work directly supports AI safety and model robustness for leading research labs.
This is a generalist red teaming role. You will probe models across the full spectrum of risk categor
How to Apply on Ashby
- Ashby is a fast, modern ATS; most applications take under 3 minutes.
- The resume parser is strong; verify parsed experience dates and job titles.
- Custom screening questions are often scored algorithmically — answer completely.
- Location field affects geo-based screening; use your actual metro area.