
Member of Technical Staff - Pre-Training

180k – 440k | Palo Alto, CA | Onsite
Summary

Designs and implements petabyte-scale data processing systems and pipelines for pre-training large language models, focusing on high-throughput CPU/GPU processing, data quality, and multi-cloud management. Requires strong systems skills in distributed data systems.

About the role

Responsibilities

  • Design and implement petabyte-scale, high-throughput data processing systems that involve both CPU- and GPU-based processing.
  • Design and implement tools for orchestrating complex data pipelines.
  • Design and implement innovative tools for improving data discoverability and data quality at scale for both pre-training and post-training across different modalities.
  • Build, run, and manage innovative data pipelines for creating high-quality training data.

Basic Qualifications

  • Strong systems skills in configuring and troubleshooting complex distributed data processing systems for maximum performance.
  • Experience building bespoke data processing systems from scratch.
  • Experience preparing pre-training and post-training data for state-of-the-art large language models and generative models.
  • Experience organizing and meticulously bookkeeping data of multiple modalities, from many sources, across multiple clouds.

Compensation and Benefits

  • $180,000 - $440,000 USD
  • Base salary is just one part of our total rewards package at xAI, which also includes equity, comprehensive medical, vision, and dental coverage, access to a 401(k) retirement plan, short- and long-term disability insurance, life insurance, and various other discounts and perks.
Skills
Distributed Systems, GPU Processing, CPU Processing, Data Pipelines, Kubernetes, Large Language Models, Generative Models, Multi-Cloud, Data Processing, Petabyte-Scale Systems