Researcher, Agent Post-Training, Personality

295k – 445k | San Francisco, CA | ML Engineering | Onsite | 7+ YOE
Summary

Researcher on the Agent Post-Training Personality team shaping how frontier agents communicate, collaborate, and build trust. Focuses on turning qualitative behavioral insights into evals, training data, reward signals, and model improvements.

About the role

Responsibilities

  • Develop a rigorous understanding of what makes an agent a great collaborator across professional, creative, technical, and everyday work.
  • Turn qualitative judgments about model behavior into concrete hypotheses, evals, graders, and training interventions.
  • Study explicit and implicit user signals to understand which behaviors create trust, satisfaction, continued use, and successful outcomes.
  • Work with human experts and trainers to produce high-quality, tasteful rollouts and preference data that capture excellent collaborative behavior.
  • Improve reward models and RL objectives for model behaviors.
  • Work with pretraining and early-training teams on data mixtures, objectives, synthetic data, and other upstream choices that shape downstream personality.
  • Build sustainable pipelines for updating older training data as our understanding of excellent model behavior evolves.
  • Partner closely with ChatGPT, Codex, and other product teams to turn consumer insight into model improvements and validate them in real workflows.
  • Own projects end to end, from observing a subtle behavioral failure through experimentation, training, evaluation, and launch.

Requirements

  • Strong technical foundations in machine learning, software engineering, statistics, behavioral science, HCI, or a related field.
  • Strong taste for model behavior: ability to explain why one response feels thoughtful, natural, and useful while another does not.
  • Experience with LLMs, post-training, RL/RLHF, reward modeling, evals, synthetic data, pretraining data, or production ML systems.
  • Ability to translate subjective-seeming product questions into falsifiable hypotheses and rigorous evaluations without losing nuance.
  • Ability to work effectively with researchers, engineers, product teams, designers, domain experts, and human-data teams, and within safety boundaries.
  • Ability to communicate clearly across technical and non-technical groups.
  • Excitement for ambiguous capability problems where the signal is noisy, the failures are qualitative, and the solution may involve data, training, evals, product changes, or all of the above.

Nice-to-Haves

  • Instinctive user perspective and deep care about how models feel to work with.
  • Care about preserving individuality, adaptability, and behavioral diversity.
  • Experience building load-bearing systems and processes.
Skills
Machine Learning, LLMs, RLHF, Reward Modeling, Evals, Synthetic Data, Pretraining Data, Production ML Systems, Behavioral Science, HCI