
ML Runtime Optimization Engineer

159k – 199k · Sunnyvale, CA · Onsite · 3+ YOE
Summary

Optimizes ML models for performance on embedded compute platforms in ADAS/AD stacks, focusing on inference efficiency, pruning, quantization, and profiling across GPU/CPU/SoC architectures. Requires 3+ years experience with deep learning frameworks and embedded systems.

About the role

Responsibilities

  • Drive ML performance optimization on multiple technologies for on-road and off-road ADAS / AD stacks targeting deployment on a variety of embedded compute platforms
  • Develop compute usage strategies to optimize efficiency and latency of model inference for compute boards selected by our customers
  • Work on model pruning and quantization, and support deployment on memory-constrained platforms
  • Collaborate closely with ML engineers and software developers to find efficient model architectures and optimize them
  • Set up methodologies to profile model performance on target embedded compute platforms and identify performance bottlenecks as part of stack integration

Requirements

  • Bachelor's degree in Electrical Engineering, Computer Science, Mathematics, Physics, or a related field
  • 3+ years of experience with ML accelerators, GPU, CPU, SoC architecture and micro-architecture
  • Strong software development skills with a focus on embedded programming
  • Experience profiling and optimizing model performance on embedded compute platforms
  • Experience working with deep learning frameworks (e.g., PyTorch, JAX, ONNX)

Nice to Have

  • M.Sc. or PhD in an ML-related area
  • Experience building an ML optimization framework from scratch
  • Experience deploying ML solutions to embedded chips for real-time robotics applications

Compensation

  • Base salary range: $159,053 - $199,295 USD annually
  • Equity, comprehensive health, dental, vision, life and disability insurance, 401k with employer match, learning and wellness stipends, paid time off
Skills

PyTorch, JAX, ONNX, TensorRT, CUDA, XLA, Triton, GPU, ML accelerators, embedded programming