Member of Technical Staff - Pre-Training

180k – 440k | Palo Alto, CA | Onsite | Apr 15

Summary

Designs and implements petabyte-scale data processing systems and pipelines for pre-training large language models, focusing on high-throughput CPU/GPU processing, data quality, and multi-cloud management. Requires strong systems skills in distributed data systems.

About the role

Responsibilities

Design and implement petabyte-scale, high-throughput data processing systems that involve both CPU- and GPU-based processing.
Design and implement tools for orchestrating complex data pipelines.
Design and implement innovative tools for improving data discoverability and data quality at scale for both pre-training and post-training across different modalities.
Build, run, and manage innovative data pipelines for creating high-quality training data.

Basic Qualifications

Strong systems skills in configuring and troubleshooting complex distributed data processing systems for maximum performance.
Experience building bespoke data processing systems from scratch.
Experience preparing pre-training and post-training data for state-of-the-art large language models and generative models.
Experience organizing and meticulously bookkeeping data across multiple clouds, multiple modalities, and many sources.

Compensation and Benefits

$180,000 - $440,000 USD
Base salary is just one part of our total rewards package at xAI, which also includes equity, comprehensive medical, vision, and dental coverage, access to a 401(k) retirement plan, short & long-term disability insurance, life insurance, and various other discounts and perks.

Skills

Distributed Systems, GPU Processing, CPU Processing, Data Pipelines, Kubernetes, Large Language Models, Generative Models, Multi-Cloud, Data Processing, Petabyte-Scale Systems
