ML Research Engineer - Hardware Codesign

185k – 455k | San Francisco, CA | Hybrid | Jan 13

Summary

The Research-Hardware Codesign Engineer bridges ML research and silicon architecture: debugging performance gaps, writing quantization kernels, prototyping numerics in RTL, and analyzing system tradeoffs for AI-optimized hardware.

About the role

In this role you will:

Build on our roofline simulator to track evolving workloads, and deliver analyses that quantify the impact of system architecture decisions and support technology pathfinding.
Debug gaps between performance simulation and real measurements; clearly communicate root cause, bottlenecks, and invalid assumptions.
Write emulation kernels for low-precision numerics and lossy compression schemes, and give Research the information they need to trade off efficiency against model quality.
Prototype numerics modules by pushing RTL through synthesis; hand off novel numerics cleanly, or occasionally own an RTL module end-to-end.
Proactively pull in new ML workloads, prototype them with rooflines and/or functional simulation, and drive initial evaluation of new opportunities or risks.
Understand the whole picture from ML science to hardware optimization, and slice this end-to-end objective into near-term deliverables.
Build ad-hoc collaborations across teams with very different goals and areas of expertise, and keep progress unblocked.
Communicate design tradeoffs clearly with explicit assumptions and confidence levels; produce a trail of evidence that enables confident execution.

You will thrive in this role if you have:

An exceptional track record of high-quality technical output, and a bias for shipping a prototype now and iterating later in the absence of clear requirements.
Strong Python, plus C++ or Rust, with a careful attitude toward correctness and an intuition for clean extensibility.
Experience writing Triton, CUDA, or similar, and an understanding of how tensor ops map to hardware functional units.
Working knowledge of PyTorch or JAX; experience in large ML codebases is a plus.
Practical understanding of floating point numerics, the ML tradeoffs of reduced precision, and the current state of the art in model quantization.
Deep understanding of transformer models, and strong intuition for transformer rooflines and the tradeoffs of sharded training and inference in large-scale ML systems.
Experience writing RTL (especially for floating point logic) and understanding of PPA tradeoffs is a plus.
Strong cross-functional communication (e.g. across ML researchers and hardware engineers); ability to slice ambiguous early-incubation ideas into concrete arenas in which progress can be made.

Skills

Python, C++, Rust, Triton, CUDA, PyTorch, JAX, RTL, roofline simulator, quantization
