OpenAI

AI Research and Deployment

Model Policy

$207–295k San Francisco, California, United States FULL TIME Remote Friendly

Market Sentiment

HIGH DEMAND

Neural analysis suggests this role is optimal for Mid+ candidates.

The Brief

“Model Policy at OpenAI. Skills: Model policy design and maintenance, Translating risk into behavioral specifications, Operationalizing policy into scalable model behavior, Using empirical evidence to inform policy decisions, Technical judgment on model trainability and measurability. Design and maintain model policies across safety-relevant domains, including dual-use, agentic, and emerging frontier-risk areas. Translate risk and harm models into clear behavioral specifications, evaluation criteria, grading guidance, and system-level safeguards.”

What You'll Achieve.

Align model behavior with desired human values and norms; Drive rapid policy taxonomy iteration based on data; Define evaluation criteria for foundational models’ ability to reason about safety; Define how OpenAI’s models should behave in high-risk or high-ambiguity contexts; Build policies that are technically grounded, measurable, and responsive to real-world risk; Ensure policies can be reliably measured and improved

Industry & Context.

AI Research and Deployment

Problems you'll solve

Reason from first principles; Turn ambiguity into practical model behavior; Think in systems across policy, data, graders, classifiers, training, deployment safeguards, measurement, monitoring, and escalation workflows

Eligibility Requirements

Relocation support for new employees; Hybrid model: three days in the office per week, with optional work from home on Thursdays and Fridays

What They're Looking For.

Must Have

Experience building or applying policies, taxonomies, harm models, threat models, or risk frameworks for complex technical, social, or adversarial systems; Ability to move across domains without needing to be the deepest subject-matter expert in every area, while knowing when to seek expert input; Ability to turn fuzzy questions into structured policy frameworks, evaluation criteria, operational guidance, and enforceable model behavior; Comfortable using empirical evidence, including evaluations, red-teaming results, deployment observations, and model failure modes, to inform policy decisions; Technical judgment about what model behavior can realistically be trained, measured, evaluated, and enforced at scale; Ability to work well across research, engineering, product, policy, domain experts, and operational teams; Ability to write clearly about complex tradeoffs where safety, user value, and implementation constraints all matter; Pragmatic approach to safety, focused on reducing real-world risk while preserving legitimate, beneficial, and socially valuable uses of AI; Grounded in implementation details, empirical results, and what can actually be trained or measured

Nice to Have

Specific expertise or speciality related to model policy; Judgment about how advanced AI systems may affect real-world risk, especially in ambiguous, fast-moving, or high-impact areas; Experience in fast-paced, collaborative research environments where priorities shift as models, evidence, and risks change

What You'll Do.

Design and maintain model policies across safety-relevant domains, including dual-use, agentic, and emerging frontier-risk areas

Translate risk and harm models into clear behavioral specifications, evaluation criteria, grading guidance, and system-level safeguards

Define practical boundaries between beneficial uses of AI and assistance that could materially enable harm, exploitation, misuse, or unsafe outcomes

Build policy artifacts that support model training, evaluation, and deployment

Use red-teaming results, deployment data, model failures, over-refusals, under-refusals, and ambiguous edge cases to improve policy and evaluation quality over time

Identify emerging capability areas where frontier AI systems could create new safety challenges or lower barriers to harm

Study real-world deployments to identify where model behavior succeeds, fails, or drifts from the intended safety posture

Combine longer-horizon safety research with hands-on launch and deployment work

Contribute to system cards and external communications on OpenAI's approach to model safety and risk mitigation

Design and run human data campaigns, including gold set construction and eval coverage analysis, to ensure policies can be reliably measured and improved

How You'll Work.

Team & Collaboration

Work closely with research, engineering, product, preparedness, and operations teams to build policies that are technically grounded, measurable, and responsive to real-world risk; Partner with safety researchers, engineers, product teams, and other stakeholders to operationalize policy into scalable model behavior and measurable safeguards; Work well across research, engineering, product, policy, domain experts, and operational teams

Communication Scope

Write clearly about complex tradeoffs where safety, user value, and implementation constraints all matter

Full Job Description

About the Team

Our Safety Systems team (https://openai.com/safety/safety-systems) is at the forefront of OpenAI's mission to build and deploy safe AGI, driving our commitment to AI safety and fostering a culture of trust and transparency. Within Safety Systems, the Model Policy team aligns model behavior with desired human values and norms. We co-design policy with models and for models by driving rapid policy taxonomy iteration based on data and defining evaluation criteria for foundational models’ ability to reason about safety.

About the Role

If you have a specific expertise or speciality related to this work, please note it in your application via your resume, cover letter or application note.

Frontier AI systems are expanding what people can do across domains, creating both enormous opportunities and difficult safety questions: when should a model help, when should it refuse, and how do we make those boundaries clear enough to train, evaluate, and enforce?

In this role, you will help define how OpenAI’s models should behave in high-risk or high-ambiguity contexts, such as agentic systems, multimodal systems, user safety, privacy, and other emerging risk domains. This is an ideal role for someone who can move across unfamiliar topics, reason from first principles, and turn ambiguity into practical model behavior. You will work closely with research, engineering, product, preparedness, and operations teams to build policies that are technically grounded, measurable, and responsive to real-world risk.

In this role, you will:

- Design and maintain model policies across safety-relevant domains, including dual-use, agentic, and emerging frontier-risk areas.
- Translate risk and harm models into clear behavioral specifications, evaluation criteria, grading guidance, and system-level safeguards.
- Define practical boundaries between beneficial uses of AI and assistance that could materially enable harm, exploitation, misuse, or unsafe outcomes.
- Build policy artifacts t

SKILL SIGNAL 49 detected · ranked by frequency

Translating risk and harm models into clear behavioral specifications ×3

Defining practical boundaries between beneficial uses of AI and assistance that could materially enable harm, exploitation, misuse, or unsafe outcomes ×3

Building policy artifacts that support model training, evaluation, and deployment ×3

Using red-teaming results, deployment data, model failures, over-refusals, under-refusals, and ambiguous edge cases to improve policy and evaluation quality ×3

Studying real-world deployments to identify where model behavior succeeds, fails, or drifts from the intended safety posture ×3

Designing and running human data campaigns ×3

Model policy design and maintenance ×2

Translating risk into behavioral specifications ×2

Operationalizing policy into scalable model behavior ×2

Using empirical evidence to inform policy decisions ×2

Technical judgment on model trainability and measurability ×2

AI systems

Frontier AI systems

Agentic systems

Multimodal systems

Foundational models

Policy design

Behavioral specifications

Evaluation criteria

Grading guidance

System-level safeguards

Risk mitigation

Harm models

Threat models

Risk frameworks

Policy taxonomy iteration

Data analysis

Empirical evidence utilization

Policy operationalization

Model behavior alignment

Human data campaign design

Gold set construction

BEHAVIOURAL

Ability to move across unfamiliar topics

Reasoning from first principles

Turning ambiguity into practical model behavior

Collaboration

Pragmatism

Adaptability

Role Details

Work Mode hybrid

Type FULL TIME

Category model-policy

Salary Band 200k+

AI-Extracted Insights

Domain Areas

ai-safety, ai-risk, agentic-systems, multimodal-systems, user-safety, privacy, emerging-risk-domains, dual-use-technology

How to Apply on Ashby

Ashby is a fast modern ATS — most applications take under 3 minutes.
The resume parser is strong; verify parsed experience dates and job titles.
Custom screening questions are often scored algorithmically — answer completely.
Location field affects geo-based screening; use your actual metro area.
