ML Engineer, Generative Video
Build and scale video generation models at an AI-native video platform. Focus on training, inference optimization, and productionizing large-scale multimodal models.
Responsibilities
- Train and optimize large-scale video and multimodal models
- Improve efficiency across training and inference (memory, latency, cost)
- Implement techniques such as distillation, quantization, and pruning to aggressively accelerate diffusion and autoregressive generation
- Build and maintain distributed training systems
- Optimize GPU utilization, parallelism, and throughput
- Develop tooling for experimentation, evaluation, and debugging
- Translate research models into robust, production-ready systems
- Monitor and improve model performance in real-world usage
Requirements
- BS/MS/PhD in CS, ML, or a related field
- 2+ years of industry experience
- Strong experience in deep learning systems and infrastructure
- Expertise in PyTorch, CUDA, Triton, and distributed training (FSDP, etc.)
- Experience scaling and optimizing large models under low-latency inference constraints
- Strong debugging and performance profiling skills
- Ability to move quickly from prototype to production
Benefits
- Comprehensive medical, dental, and vision plans
- 401(k) with employer match
- Commuter benefits
- Catered lunch multiple days per week
- Dinner stipend every night if you're working late
- Grubhub subscription
- Health and wellness perks
- Multiple team offsites per year with team events every month
- Generous PTO policy