Model Policy Manager

Defines and maintains policies for AI model behavior in high-risk domains like agentic systems and user safety. Collaborates with research, engineering, and product teams to operationalize policies into measurable safeguards using empirical data and red-teaming.

207k – 295k | San Francisco, CA | Security Engineering | Hybrid

About the role

Responsibilities

Design and maintain model policies across safety-relevant domains, including dual-use, agentic, and emerging frontier-risk areas.
Translate risk and harm models into clear behavioral specifications, evaluation criteria, grading guidance, and system-level safeguards.
Define practical boundaries between beneficial uses of AI and assistance that could materially enable harm, exploitation, misuse, or unsafe outcomes.
Build policy artifacts that support model training, evaluation, and deployment.
Partner with safety researchers, engineers, product teams, and other stakeholders to operationalize policy into scalable model behavior and measurable safeguards.
Use red-teaming results, deployment data, model failures, over-refusals, under-refusals, and ambiguous edge cases to improve policy and evaluation quality over time.
Identify emerging capability areas where frontier AI systems could create new safety challenges or lower barriers to harm.
Study real-world deployments to identify where model behavior succeeds, fails, or drifts from the intended safety posture.
Combine longer-horizon safety research with hands-on launch and deployment work.
Contribute to system cards, safety reports, policy documentation, launch reviews, and external communications on OpenAI's approach to model safety and risk mitigation.
Design and run human data campaigns, including gold set construction, labeling guidance, calibration, adjudication, and eval coverage analysis, to ensure policies can be reliably measured and improved.

Requirements

Strong judgment about how advanced AI systems may affect real-world risk, especially in ambiguous, fast-moving, or high-impact areas.
Experience building or applying policies, taxonomies, harm models, threat models, or risk frameworks for complex technical, social, or adversarial systems.
Ability to move across domains without needing to be the deepest subject-matter expert in every area, while knowing when to seek expert input.
Ability to turn ambiguous questions into structured policy frameworks, evaluation criteria, operational guidance, and enforceable model behavior.
Comfortable using empirical evidence, including evaluations, red-teaming results, deployment observations, and model failure modes, to inform policy decisions.
Think in systems across policy, data, graders, classifiers, training, deployment safeguards, measurement, monitoring, and escalation workflows.
Technical judgment about what model behavior can realistically be trained, measured, evaluated, and enforced at scale.
Work well with research, engineering, product, policy, and operations teams, as well as with domain experts.
Write clearly about complex tradeoffs where safety, user value, and implementation constraints all matter.
Pragmatic approach to safety, focused on reducing real-world risk while preserving legitimate, beneficial, and socially valuable uses of AI.
Enjoy fast-paced, collaborative research environments where priorities shift as models, evidence, and risks change.
Stay grounded in implementation details, empirical results, and what can actually be trained or measured.

Skills

AI Safety, Red-Teaming, Risk Frameworks, Threat Models, Policy Frameworks, Evaluation Criteria, Harm Models, System Safeguards, Model Training, Model Deployment
