Researcher, Agent Post-Training, Personality
Researcher on the Agent Post-Training Personality team shaping how frontier agents communicate, collaborate, and build trust. Focuses on turning qualitative behavioral insights into evals, training data, reward signals, and model improvements.
Responsibilities
- Develop a rigorous understanding of what makes an agent a great collaborator across professional, creative, technical, and everyday work.
- Turn qualitative judgments about model behavior into concrete hypotheses, evals, graders, and training interventions.
- Study explicit and implicit user signals to understand which behaviors create trust, satisfaction, continued use, and successful outcomes.
- Work with human experts and trainers to produce high-quality, tasteful rollouts and preference data that capture excellent collaborative behavior.
- Improve the reward models and RL objectives that shape model behavior.
- Work with pretraining and early-training teams on data mixtures, objectives, synthetic data, and other upstream choices that shape downstream personality.
- Build sustainable pipelines for updating older training data as our understanding of excellent model behavior evolves.
- Partner closely with ChatGPT, Codex, and other product teams to turn consumer insight into model improvements and validate them in real workflows.
- Own projects end to end, from observing a subtle behavioral failure through experimentation, training, evaluation, and launch.
Requirements
- Strong technical foundations in machine learning, software engineering, statistics, behavioral science, HCI, or a related field.
- Strong taste for model behavior: ability to explain why one response feels thoughtful, natural, and useful while another does not.
- Experience with LLMs, post-training, RL/RLHF, reward modeling, evals, synthetic data, pretraining data, or production ML systems.
- Ability to translate subjective-seeming product questions into falsifiable hypotheses and rigorous evaluations without losing nuance.
- Ability to work effectively with researchers, engineers, product teams, designers, domain experts, and human-data teams, and to operate within safety boundaries.
- Ability to communicate clearly across technical and non-technical groups.
- Excitement for ambiguous capability problems where the signal is noisy, the failures are qualitative, and the solution may involve data, training, evals, product changes, or all of the above.
Nice-to-Haves
- Instinctive user perspective and deep care about how models feel to work with.
- Care about preserving individuality, adaptability, and behavioral diversity.
- Experience building load-bearing systems and processes.
Research Engineer / Research Scientist
Research and develop improvements to models' personalization and agentic capabilities through reinforcement learning, dataset creation, and post-training methods. Requires strong ML engineering skills and research experience with novel models.
Research Engineer/Research Scientist, Audio
Research Engineer/Scientist role focused on advancing audio capabilities in large language models, including training speech/audio models, developing codecs, and building conversational AI systems. Requires strong experience in audio ML research and engineering with JAX or PyTorch.
Staff Applied Scientist
Build and own algorithmic systems that evaluate providers, make recommendations, and optimize healthcare outcomes for cost, quality, and access. Requires 3+ years of experience shipping data-driven algorithms to production.
Senior Member of Research Staff, Optimization
Lead optimization research applying large-scale constrained optimization and ML to real-time trading decisions. Requires 5-10+ years of experience, a strong math/ML background, production coding skills, and PhD-level coursework.