Researcher, Pretraining Safety

Develop techniques to predict and mitigate unsafe behaviors in early-stage base models, design safer pretraining architectures, and integrate safety signals throughout training. Collaborate across safety teams to build robust, scalable safety foundations grounded in real-world risks.

295k – 445k · San Francisco, CA · AI Research · On-site

About the role

Responsibilities

  • Develop new techniques to predict, measure, and evaluate unsafe behavior in early-stage models
  • Design data curation strategies that improve pretraining priors and reduce downstream risk
  • Explore safe-by-design architectures and training configurations that improve controllability
  • Introduce novel safety-oriented loss functions, metrics, and evals into the pretraining stack
  • Work closely with cross-functional safety teams to unify pre- and post-training risk reduction

Requirements

  • Experience developing or scaling pretraining architectures (LLMs, diffusion models, multimodal models, etc.)
  • Comfortable working with training infrastructure, data pipelines, and evaluation frameworks (e.g., Python, PyTorch/JAX, Apache Beam)
  • Enjoy hands-on research — designing, implementing, and iterating on experiments
  • Enjoy collaborating with diverse technical and cross-functional partners (e.g., policy, legal, training)
  • Data-driven with strong statistical reasoning and rigor in experimental design
  • Value building clean, scalable research workflows and streamlining processes

Skills

PyTorch, JAX, Python, Apache Beam, LLMs, Diffusion Models, Multimodal Models, Statistical Reasoning, Data Pipelines, Evaluation Frameworks

Similar roles

AI Research jobs

Researcher, Misalignment Research

Designs worst-case demonstrations and adversarial evaluations to uncover AGI misalignment risks like deception and power-seeking. Builds automated stress-testing infrastructure and researches alignment failure modes to inform OpenAI's safety strategy. Requires 4+ years in AI red-teaming or adversarial ML.

295k – 445k · San Francisco, CA · AI Research · On-site · 4+ YOE · LLMs · AI Safety

Researcher, Loss of Control

Designs and implements mitigation stacks to prevent loss of control risks in frontier AI models, including prevention, monitoring, detection, and enforcement. Requires expertise in deep learning, transformers, PyTorch/TensorFlow, and AI safety research.

295k – 445k · San Francisco, CA · AI Research · On-site · LLMs · PyTorch

Researcher, Synthetic RL

Develops novel reinforcement learning techniques using synthetic environments and feedback to enhance large-scale AI models. Designs experiments, analyzes dynamics, and integrates research into production systems; requires strong RL/ML background and engineering skills.

295k – 445k · San Francisco, CA · AI Research · Hybrid · Python · Research

Research Engineer / Research Scientist, Post-Training

Researches and develops improvements to pre-trained models for deployment in ChatGPT and the API using reinforcement learning and product-driven approaches. Requires strong ML engineering, research experience with novel models, and the ability to debug large codebases.

295k – 555k · San Francisco, CA · AI Research · Hybrid · LLMs · Python

Research Engineer, Codex

Advances AI coding models through research, experimentation, and system optimization on the Codex team. Collaborates to improve code generation, reasoning, and performance for real-world deployment.

295k – 445k · San Francisco, CA · AI Research · Hybrid · LLMs · Python