Research, Vision Expertise
Conducts research on visual perception, multimodal learning, and large-scale AI model training. Designs architectures, builds datasets and evaluations, and collaborates on frontier models. Requires ML expertise, Python proficiency, and experimental rigor.
What You’ll Do
- Own research projects on training and performance analysis of multimodal AI models.
- Curate and build large-scale datasets and evaluation benchmarks to advance vision capabilities.
- Work with our data infrastructure engineers, pretraining researchers and engineers, and product team to create frontier multimodal models and the products that leverage them.
- Publish and present research that moves the entire community forward. Share code, datasets, and insights that accelerate progress across industry and academia.
Skills and Qualifications
Minimum qualifications:
- Ability to design, run, and analyze experiments thoughtfully, with demonstrated research judgment and empirical rigor.
- Understanding of machine learning fundamentals, large-scale training, and distributed compute environments.
- Proficiency in Python and familiarity with at least one deep learning framework (e.g., PyTorch, TensorFlow, or JAX), including comfort debugging distributed training and writing code that scales.
- Bachelor’s degree or equivalent experience in Computer Science, Machine Learning, Physics, Mathematics, or a related discipline with strong theoretical and empirical grounding.
- Clear communication, including the ability to explain complex technical concepts in writing.
Preferred qualifications:
- Research or engineering contributions in visual reasoning, spatial understanding, or multimodal architecture design.
- Experience developing evaluation frameworks for multimodal tasks.
- Publications or open-source contributions in vision-language modeling, video understanding, or multimodal AI.
- A strong grasp of probability, statistics, and ML fundamentals. You can look at experimental data and distinguish between real effects, noise, and bugs.
- PhD in Computer Science, Machine Learning, Physics, Mathematics, or a related discipline with strong theoretical and empirical grounding, or equivalent industry research experience.
Logistics
Compensation: Depending on background, skills, and experience, the expected annual salary range for this position is $350,000 - $475,000 USD.
Benefits: Generous health, dental, and vision benefits, unlimited PTO, paid parental leave, and relocation support as needed.