# Member of Technical Staff - Large Scale Data Infrastructure
**Company:** [Black Forest Labs](https://hotfix.jobs/companies/black-forest-labs)
**Location:** San Francisco, CA
**Salary:** $180K-$300K
**Skills:** Python, PyTorch, S3, GCS, Azure Blob, Parquet, FFmpeg, PyAV, Kubernetes, Slurm, WebDataset
**Posted:** 2026-04-16
> Builds scalable data infrastructure for petabyte-to-exabyte-scale training on thousands of GPUs, including data loaders, petabyte-scale storage systems, multi-cloud abstractions, and performance debugging for AI models.
## Job Description
## What You’ll Work On
- Scalable data loaders for training runs across thousands of GPUs
- Efficient storage and retrieval systems for petabyte-scale datasets
- Multi-cloud object storage abstraction
- Large-scale data migrations across storage systems and providers
- Debugging and resolving performance bottlenecks in distributed data loading

## Technical Focus
- **Python, PyTorch DataLoader internals**
- **Object storage** (e.g. S3, Azure Blob, GCS)
- **Parquet** for metadata
- **Video**: ffmpeg, PyAV, codec fundamentals

## What We’re Looking For
- Built and operated data pipelines at petabyte scale
- Optimized data loading
- Worked with petabyte-scale video and image datasets
- Written processing jobs operating on millions of files
- Debugged distributed system bottlenecks across large fleets of machines

## Nice to have
- Experience with streaming dataset formats (e.g. WebDataset)
- Video codec internals and frame-accurate seeking
- Distributed systems experience
- Slurm and Kubernetes for job orchestration
- Experience with object storage performance tuning across providers

**Base Annual Salary (SF based role): $180,000–$300,000 USD + Equity**
**Apply:** https://hotfix.jobs/jobs/member-of-technical-staff-large-scale-data-infrastructure-at-black-forest-labs-d024dcf6-ea31-49f2-bd83-de2075b36b63
**Canonical:** https://hotfix.jobs/jobs/member-of-technical-staff-large-scale-data-infrastructure-at-black-forest-labs-d024dcf6-ea31-49f2-bd83-de2075b36b63