
Research, Post-Training Data

350k – 475k · San Francisco, CA · ML Engineering · Onsite
Summary

Conducts post-training research for AI models, designing data collection strategies, developing labeling pipelines, modeling human preferences, and iterating on evaluations to improve model alignment, reasoning, and helpfulness. Requires strong Python skills, ML framework proficiency, and experimental rigor.

About the role

What You’ll Do

  • Design and execute data collection and synthesis strategies for post-training by combining human feedback, preference data, and synthetic examples to guide model behavior.
  • Develop pipelines and frameworks for scalable, high-quality human labeling, model-assisted labeling, and synthetic data generation.
  • Research and model human preferences and behavior, creating data-driven methods to improve reasoning, truthfulness, and helpfulness.
  • Iterate on evals: post-training involves a never-ending loop of defining a set of evaluations, optimizing against them, and then realizing your existing evals don’t capture what matters. You’ll be responsible both for making the numbers go up and for making sure the numbers are meaningful.
  • Design and evaluate metrics and benchmarks that measure data quality, alignment, and the real-world impact of post-training interventions.
  • Scale and explore: post-training will involve a combination of scaling the existing methodologies and developing new ones.
  • Publish and present research that moves the entire community forward. Share code, datasets, and insights that accelerate progress across industry and academia.

Skills and Qualifications

Minimum qualifications:

  • Strong engineering skills, including the ability to contribute code and debug in complex codebases.
  • Experience with data curation, human feedback, or synthetic data generation for large language models or similar systems.
  • Ability to design, run, and interpret experiments with scientific rigor and clarity.
  • Proficiency in Python and familiarity with at least one deep learning framework (e.g., PyTorch, TensorFlow, or JAX). Comfortable with debugging distributed training and writing code that scales.
  • Bachelor’s degree or equivalent experience in Computer Science, Machine Learning, Physics, Mathematics, or a related discipline with strong theoretical and empirical grounding.
  • Clear communication, including the ability to explain complex technical concepts in writing.

Preferred qualifications:

  • A strong grasp of probability, statistics, and ML fundamentals. You can look at experimental data and distinguish between real effects, noise, and bugs.
  • Prior experience with RLHF, RLAIF, preference modeling, or reward learning for large models.
  • Experience managing or analyzing human data collection campaigns or large-scale annotation workflows.
  • Research or engineering contributions in alignment, data-centric AI, or human-AI collaboration.
  • Familiarity with synthetic data pipelines, active learning, or model-assisted labeling.
  • PhD in Computer Science, Machine Learning, Physics, Mathematics, or a related discipline with strong theoretical and empirical grounding, or equivalent industry research experience.

Logistics

Compensation: Depending on background, skills and experience, the expected annual salary range for this position is $350,000 - $475,000 USD.

Benefits: Thinking Machines offers generous health, dental, and vision benefits, unlimited PTO, paid parental leave, and relocation support as needed.

Skills
Python, PyTorch, TensorFlow, JAX, RLHF, RLAIF, synthetic data generation, human feedback, preference modeling, data pipelines