ML Runtime Optimization Engineer

159k – 199k · Sunnyvale, CA · Onsite · 3+ YOE · Apr 27

Summary

Optimizes ML models for performance on embedded compute platforms in ADAS/AD stacks, focusing on inference efficiency, pruning, quantization, and profiling across GPU/CPU/SoC architectures. Requires 3+ years of experience with deep learning frameworks and embedded systems.

About the role

Responsibilities

Drive ML performance optimization on multiple technologies for on-road and off-road ADAS / AD stacks targeting deployment on a variety of embedded compute platforms
Develop compute usage strategies to optimize efficiency and latency of model inference for compute boards selected by our customers
Work on model pruning and quantization, and support deployment on memory-constrained platforms
Collaborate closely with ML engineers and software developers on technical efforts to find and optimize efficient model architecture solutions
Set up methodologies to profile the model performance on target embedded compute platforms and identify performance bottlenecks as part of stack integration

Requirements

Bachelor's degree in Electrical Engineering, Computer Science, Mathematics, Physics, or a related field
3+ years of experience with ML accelerators, GPU, CPU, SoC architecture and micro-architecture
Strong software development skills with a focus on embedded programming
Experience profiling and optimizing model performance on embedded compute platforms
Experience working with deep learning frameworks (e.g., PyTorch, JAX, ONNX)

Nice to Have

M.Sc. or PhD in an ML-related area
Experience building an ML optimization framework from scratch
Experience deploying ML solutions to embedded chips for real-time robotics applications

Compensation

Base salary range: $159,053 - $199,295 USD annually
Equity, comprehensive health, dental, vision, life and disability insurance, 401k with employer match, learning and wellness stipends, paid time off

Skills

PyTorch, JAX, ONNX, TensorRT, CUDA, XLA, Triton, GPU, ML accelerators, embedded programming
