Research, Post-Training
Develops and tunes post-training recipes for AI models, iterates on evaluations, debugs configurations, scales methodologies, and publishes research to advance collaborative intelligence. Requires Python proficiency, familiarity with a deep learning framework, and strong ML fundamentals.
What You’ll Do
- Develop and tune the recipe: iterate on post-training recipes, consisting of a collection of datasets, training stages, and hyperparameters. Measure how recipe choices affect various metrics.
- Iterate on evals: post-training involves a never-ending loop of defining a set of evaluations, optimizing them, and then realizing your existing evals don’t capture what matters. You’ll be responsible both for making the numbers go up and for making sure the numbers are meaningful.
- Debug and understand: while tuning the details of a training configuration, we often observe results that don’t quite make sense. You’ll be responsible both for getting things to work and for developing a deeper understanding that we can bring to the next problem.
- Scale and explore: post-training will involve a combination of scaling the existing methodologies and developing new ones. We’ll want both to measure how performance metrics scale with dataset size and to explore completely different kinds of training datasets.
- Publish and present research that moves the entire community forward. Share code, datasets, and insights that accelerate progress across industry and academia.
Skills and Qualifications
Minimum qualifications:
- Proficiency in Python and familiarity with at least one deep learning framework (e.g., PyTorch, TensorFlow, or JAX). Comfortable with debugging distributed training and writing code that scales.
- Bachelor’s degree or equivalent experience in Computer Science, Machine Learning, Physics, Mathematics, or a related discipline with strong theoretical and empirical grounding.
- Clarity in communication, including the ability to explain complex technical concepts in writing.
Preferred qualifications:
- A strong grasp of probability, statistics, and ML fundamentals. You can look at experimental data and distinguish between real effects, noise, and bugs.
- Prior experience with RLHF, RLAIF, preference modeling, or reward learning for large models.
- Experience managing or analyzing human data collection campaigns or large-scale annotation workflows.
- Research or engineering contributions in alignment, data-centric AI, or human-AI collaboration.
- PhD in Computer Science, Machine Learning, Physics, Mathematics, or a related discipline with strong theoretical and empirical grounding, or equivalent industry research experience.
Logistics
Compensation: Depending on background, skills, and experience, the expected annual salary range for this position is $350,000 to $475,000 USD.
Member of Technical Staff
Hands-on technical contributor focused on stabilizing and advancing large language model training, fine-tuning, and research in AI/deep learning. Requires a bachelor's degree and 2+ years of experience with distributed systems, ML infrastructure, and programming in Rust/C++/Python.
Research Engineer / Research Scientist
Research and develop improvements to models' personalization and agentic capabilities through reinforcement learning, dataset creation, and post-training methods. Requires strong ML engineering skills and research experience with novel models.
Research Engineer/Research Scientist, Audio
Research Engineer/Scientist role focused on advancing audio capabilities in large language models, including training speech/audio models, developing codecs, and building conversational AI systems. Requires strong experience in audio ML research and engineering with JAX or PyTorch.