ML Runtime Optimization Engineer
Optimizes ML models for performance on embedded compute platforms in ADAS/AD stacks, focusing on inference efficiency, pruning, quantization, and profiling across GPU/CPU/SoC architectures. Requires 3+ years of experience with deep learning frameworks and embedded systems.
Responsibilities
- Drive ML performance optimization across multiple technologies for on-road and off-road ADAS/AD stacks deployed on a variety of embedded compute platforms
- Develop compute usage strategies to optimize efficiency and latency of model inference for compute boards selected by our customers
- Work on model pruning and quantization, and support deployment on memory-constrained platforms
- Collaborate closely with ML engineers and software developers to find and optimize efficient model architectures
- Set up methodologies to profile model performance on target embedded compute platforms and identify performance bottlenecks as part of stack integration
Requirements
- Bachelor's degree in Electrical Engineering, Computer Science, Mathematics, Physics, or a related field
- 3+ years of experience with ML accelerators, GPU, CPU, SoC architecture and micro-architecture
- Strong software development skills with a focus on embedded programming
- Experience profiling and optimizing model performance on embedded compute platforms
- Experience working with deep learning frameworks (e.g., PyTorch, JAX, ONNX)
Nice to Have
- M.Sc. or Ph.D. in an ML-related area
- Experience building an ML optimization framework from scratch
- Experience deploying ML solutions to embedded chips for real-time robotics applications
Compensation
- Base salary range: $159,053 - $199,295 USD annually
- Equity, comprehensive health, dental, vision, life and disability insurance, 401k with employer match, learning and wellness stipends, paid time off
Staff Software Engineer, AI Runtime
Staff Software Engineer building and scaling Databricks' managed large-scale GPU training platform (AIR). Focus on distributed training performance, scheduling, fault tolerance, and developer experience for thousands of accelerators.
Senior Software Engineer, AI Runtime
Senior Software Engineer building and scaling Databricks' managed GPU training platform (AI Runtime) for large-scale distributed AI model training. Requires 5+ years in distributed systems and hands-on experience with GPU training frameworks.
Sr. Machine Learning Engineer, Computer Vision
Build and prototype diffusion-based text-to-image generative models (Pinterest Canvas) using large-scale visual-text datasets. Requires 5+ years industry computer vision experience and an M.S. or Ph.D.
Senior AI/ML Engineer
Senior AI/ML Engineer building transformer and deep learning models on financial and behavioral data to power personalized growth and marketing experiences at Chime. Requires strong production ML experience with PyTorch, AWS, and large-scale data infrastructure.