Member of Technical Staff - Pre-Training
Designs and implements petabyte-scale data processing systems and pipelines for pre-training large language models, focusing on high-throughput CPU/GPU processing, data quality, and multi-cloud management. Requires strong systems skills in distributed data systems.
Responsibilities
- Design and implement petabyte-scale, high-throughput data processing systems that involve both CPU- and GPU-based processing.
- Design and implement tools for orchestrating complex data pipelines.
- Design and implement innovative tools for improving data discoverability and data quality at scale for both pre-training and post-training across different modalities.
- Build, run, and manage innovative data pipelines for creating high-quality training data.
Basic Qualifications
- Strong systems skills in configuring and troubleshooting complex distributed data processing systems for maximum performance.
- Experience building bespoke data processing systems from scratch.
- Experience preparing pre-training and post-training data for state-of-the-art large language models and generative models.
- Experience organizing and meticulously bookkeeping data of multiple modalities, from many sources, across multiple clouds.
Compensation and Benefits
- Base salary range: $180,000 - $440,000 USD
- Base salary is just one part of our total rewards package at xAI, which also includes equity, comprehensive medical, vision, and dental coverage, access to a 401(k) retirement plan, short & long-term disability insurance, life insurance, and various other discounts and perks.