Principal Research Engineer, Post-Training

275k – 400k · Redwood City, CA · ML Engineering · Onsite · 7+ YOE
Summary

Lead technical vision and execution for post-training systems that transform foundation models into intelligent, engaging products. Drive alignment algorithms, RL, and infrastructure for large-scale LLM training and serving.

About the role

What You'll Do

Technical Leadership & Mentorship

  • Define and drive the technical roadmap for mid- and post-training systems, balancing research innovation with production reliability and scalability
  • Mentor and grow a team of researchers and engineers through technical guidance, design reviews, and career development
  • Establish best practices for experimentation, model development, and deployment

Research & Model Development

  • Lead the development of alignment algorithms, optimization techniques, and training objectives to improve model capabilities and data efficiency
  • Drive advances in mid- and post-training methodologies including reinforcement learning, preference optimization, supervised fine-tuning, and emerging alignment approaches
  • Identify and execute high-impact research opportunities that improve model behavior, safety, and user engagement
  • Develop robust evaluation frameworks and quality signals to measure real-world model performance

Systems & Infrastructure

  • Lead the design of efficient training and inference systems for large-scale generative models
  • Architect scalable data pipelines that transform diverse data sources into high-quality training datasets
  • Partner with infrastructure teams to optimize distributed training, GPU utilization, and serving efficiency
  • Drive improvements in experimentation platforms, data quality systems, and model observability

Required Qualifications

  • PhD in Computer Science, Machine Learning, AI, or a related field, or equivalent industry experience
  • Significant experience leading technical projects or teams in machine learning, AI research, or large-scale distributed systems
  • Experience scaling and mentoring high-performing research and engineering teams
  • Deep understanding of modern machine learning techniques, including transformers, reinforcement learning, alignment methods, and large language models
  • Strong track record of delivering impactful research or applied ML systems in production environments
  • Expertise in designing, building, and maintaining production-quality ML systems and infrastructure
  • Experience training, serving, debugging, and optimizing large-scale models on GPU-based systems
  • Experience leading teams working on large language model training, mid-training, or post-training
  • Experience with product experimentation, online evaluation, and A/B testing frameworks
  • Strong software engineering skills with the ability to write clean, maintainable, and scalable code
  • Excellent communication skills and the ability to influence technical direction across teams
  • Experience leading complex, cross-functional initiatives across data, training infrastructure, evaluation, and model serving

Nice to Have

  • Hands-on experience with open-source models such as Mistral and Qwen, particularly adapting them through mid- and post-training for specific personas, creative writing, or role-playing applications
  • Familiarity with cloud-native ML infrastructure, including Kubernetes, Docker, and modern orchestration platforms
  • Publications in leading machine learning conferences or demonstrated contributions to the broader AI community
Skills
Transformers, Reinforcement Learning, Large Language Models, Alignment Methods, GPU-based Systems, Distributed Training, Kubernetes, Docker, Python, A/B Testing