# Model Policy Manager
**Company:** [OpenAI](https://hotfix.jobs/companies/openai)
**Location:** San Francisco, CA
**Salary:** $207K-$295K
**Skills:** AI Safety, Red-Teaming, Risk Frameworks, Threat Models, Policy Frameworks, Evaluation Criteria, Harm Models, System Safeguards, Model Training, Model Deployment
**Posted:** 2026-05-13
> Defines and maintains policies for AI model behavior in high-risk domains like agentic systems and user safety. Collaborates with research, engineering, and product teams to operationalize policies into measurable safeguards using empirical data and red-teaming.
## Job Description
## Responsibilities
- Design and maintain model policies across safety-relevant domains, including dual-use, agentic, and emerging frontier-risk areas.
- Translate risk and harm models into clear behavioral specifications, evaluation criteria, grading guidance, and system-level safeguards.
- Define practical boundaries between beneficial uses of AI and assistance that could materially enable harm, exploitation, misuse, or unsafe outcomes.
- Build policy artifacts that support model training, evaluation, and deployment.
- Partner with safety researchers, engineers, product teams, and other stakeholders to operationalize policy into scalable model behavior and measurable safeguards.
- Use red-teaming results, deployment data, model failures, over-refusals, under-refusals, and ambiguous edge cases to improve policy and evaluation quality over time.
- Identify emerging capability areas where frontier AI systems could create new safety challenges or lower barriers to harm.
- Study real-world deployments to identify where model behavior succeeds, fails, or drifts from the intended safety posture.
- Combine longer-horizon safety research with hands-on launch and deployment work.
- Contribute to system cards, safety reports, policy documentation, launch reviews, and external communications on OpenAI's approach to model safety and risk mitigation.
- Design and run human data campaigns, including gold set construction, labeling guidance, calibration, adjudication, and eval coverage analysis, to ensure policies can be reliably measured and improved.

## Requirements
- Strong judgment about how advanced AI systems may affect real-world risk, especially in ambiguous, fast-moving, or high-impact areas.
- Experience building or applying policies, taxonomies, harm models, threat models, or risk frameworks for complex technical, social, or adversarial systems.
- Ability to move across domains without needing to be the deepest subject-matter expert in every area, while knowing when to seek expert input.
- Ability to turn fuzzy questions into structured policy frameworks, evaluation criteria, operational guidance, and enforceable model behavior.
- Comfortable using empirical evidence, including evaluations, red-teaming results, deployment observations, and model failure modes, to inform policy decisions.
- Think in systems across policy, data, graders, classifiers, training, deployment safeguards, measurement, monitoring, and escalation workflows.
- Technical judgment about what model behavior can realistically be trained, measured, evaluated, and enforced at scale.
- Work well with research, engineering, product, policy, and operational teams, as well as domain experts.
- Write clearly about complex tradeoffs where safety, user value, and implementation constraints all matter.
- Pragmatic approach to safety, focused on reducing real-world risk while preserving legitimate, beneficial, and socially valuable uses of AI.
- Enjoy fast-paced, collaborative research environments where priorities shift as models, evidence, and risks change.
- Stay grounded in implementation details, empirical results, and what can actually be trained or measured.
**Apply:** https://hotfix.jobs/jobs/model-policy-manager-at-openai-bac0d405-d3ff-4dcd-8863-ec873984db44
**Canonical:** https://hotfix.jobs/jobs/model-policy-manager-at-openai-bac0d405-d3ff-4dcd-8863-ec873984db44