
Member of Technical Staff - Post-Training and RL

Develops advanced post-training and reinforcement learning techniques like RLHF/DPO and reward modeling to enhance AI model reasoning, truthfulness, and real-world capabilities at xAI. Seeks passionate AI enthusiasts obsessed with truth-seeking models; prior experience preferred but not required.

180k – 600k · Palo Alto, CA · ML Engineering · On-site

About the role

Responsibilities

  • Work on critical post-training and reinforcement learning challenges, including reward modeling, preference optimization (RLHF/DPO), and RL for improving reasoning, truthfulness, and real-world capabilities.

Basic Qualifications

  • Believe truth-seeking AI is the most important and challenging problem.
  • Obsessed with building incredibly useful models through post-training and RL techniques.
  • Power user of AI models and eager to push boundaries with reinforcement learning and alignment methods.
  • Previous work on post-training, RLHF, or models used by millions is a big plus (relevant experience not required).
  • Take pride in work and thrive in meritocratic environments.

Compensation and Benefits

  • $180,000 - $600,000 USD
  • Equity, comprehensive medical, vision, and dental coverage
  • Access to 401(k) retirement plan
  • Short- and long-term disability insurance
  • Life insurance
  • Various other discounts and perks

Skills

Reinforcement Learning · RLHF · DPO · Reward Modeling · AI Alignment · Post-Training · PyTorch · JAX · Transformers · Machine Learning

Similar roles

ML Engineering jobs

Member of Technical Staff - X Search

Develops and operates large-scale search engine infrastructure, including retrieval algorithms, indexing, and ML ranking models integrated with Grok AI. Requires experience with search systems, vector databases, and production ML in Python, Go, or Rust.

180k – 440k · Palo Alto, CA · ML Engineering · On-site · Go · Rust

Member of Technical Staff - Multimodal Understanding

Develops large-scale distributed systems and pipelines for multimodal AI pre-training, post-training, and inference across image, video, audio, and text. Requires expert Python proficiency, experience with JAX/PyTorch/XLA, and scaling multimodal ML systems.

180k – 440k · Palo Alto, CA · ML Engineering · On-site · RL · JAX

Staff Data Scientist | Modeling

Staff Data Scientist advances ML models for healthcare claims auditing, curating data, developing precise models, and optimizing business impact for health plans. Requires expertise in SQL, Python/R, and building ML models from scratch.

180k – 260k · United States · ML Engineering · Remote · R · SQL

Staff Software Engineer | GenAI & Agentic Workflows

Leads design and development of large-scale AI systems for document processing and agentic workflows in healthcare payment integrity. Requires 6+ years experience with Java/Python, productionizing LLMs/RAG, and distributed systems.

180k – 250k · United States · ML Engineering · Remote · 6+ YOE · RAG · Ray

Staff Data Scientist | Fraud / Billing

Leads development and deployment of ML and GenAI systems to detect billing errors, audits, and fraud in healthcare payments. Requires 5+ years data science experience, team management, and a degree in CS or related field.

180k – 230k · United States · ML Engineering · Remote · 5+ YOE · SQL · Python