Member of Technical Staff - Large Scale Data Infrastructure

Builds scalable data infrastructure for peta-to-exabyte scale training on thousands of GPUs, including data loaders, petabyte storage systems, multi-cloud abstractions, and performance debugging for AI models.

180k – 300k | San Francisco, CA | Data Engineering | Hybrid

About the role

What You’ll Work On

  • Scalable data loaders for training runs across thousands of GPUs
  • Efficient storage and retrieval systems for petabyte-scale datasets
  • Multi-cloud object storage abstraction
  • Large-scale data migrations across storage systems and providers
  • Debug and resolve performance bottlenecks in distributed data loading

Technical Focus

  • Python, PyTorch DataLoader internals
  • Object storage (e.g. S3, Azure Blob, GCS)
  • Parquet for metadata
  • Video: ffmpeg, PyAV, codec fundamentals

What We’re Looking For

  • Built and operated data pipelines at petabyte scale
  • Optimized data loading
  • Worked with petabyte-scale video and image datasets
  • Written processing jobs operating on millions of files
  • Debugged distributed system bottlenecks across large fleets of machines

Nice to have

  • Experience with streaming dataset formats (e.g. WebDataset)
  • Video codec internals and frame-accurate seeking
  • Distributed systems experience
  • Slurm and Kubernetes for job orchestration
  • Experience with object storage performance tuning across providers

Base Annual Salary (SF based role): $180,000–$300,000 USD + Equity

Skills

Python, PyTorch, S3, GCS, Azure Blob, Parquet, FFmpeg, PyAV, Kubernetes, Slurm, WebDataset

Member of Technical Staff - Pre-Training

Designs and implements petabyte-scale data processing systems and pipelines for pre-training large language models, focusing on high-throughput CPU/GPU processing, data quality, and multi-cloud management. Requires strong systems skills in distributed data systems.

180k – 440k | Palo Alto, CA | Data Engineering | On-site | LLMs | Kubernetes

Senior Staff Engineer, Operations Analysis (R4487)

Leads modeling, simulation, and wargaming to evaluate autonomous aircraft performance, survivability, and mission impact in military scenarios. Collaborates with engineering and DoD stakeholders using tools like AFSIM, STK, MATLAB, and Python; requires 10+ years of experience and a security clearance.

181k – 271k | Washington, DC | Data Engineering | On-site | 10+ YOE | STK | ISR

Staff Data Engineer

Architects and delivers scalable data products from healthcare datasets, designs high-performance processing systems using SQL, Spark, Python, and AI workflows, and leads cross-functional initiatives for reliable data serving to customers and applications.

181k – 282k | United States | Data Engineering | Remote | SQL | C++

Staff Software Engineer, Batch Processing Platform

Designs, implements, and optimizes high-performance batch processing infrastructure handling hundreds of petabytes using Spark, Presto/Trino, and Iceberg. Requires 6+ years in scalable big data systems and proficiency in Java, Scala, or Python.

177k – 365k | Seattle, WA | Data Engineering | Remote | 6+ YOE | Java | Trino

Staff Data Engineer

Owns and evolves data platforms, including warehouse architecture, pipelines, and modeling, to enable scalable analytics and self-service insights. Requires 7+ years of experience, advanced SQL/Python, and expertise with managed data warehouses like Snowflake.

177k – 226k | San Francisco, CA | Data Engineering | On-site | 7+ YOE | SQL | ETL