
Engineering Manager, Model Inference

220k – 270k · San Francisco, CA · Engineering Management · Hybrid · 5+ YOE

Summary

Engineering Manager leading the Model Inference team, responsible for architecting and scaling low-latency, high-throughput LLM serving infrastructure and growing a team of AI inference engineers.

About the role

What You’ll Do

  • Lead and grow a high-performing team of AI inference engineers focused on building and scaling infrastructure for Abridge’s products and APIs
  • Own the technical direction of our inference systems—making key decisions around batching, throughput, latency, and GPU utilization
  • Architect and scale inference infrastructure for reliability, efficiency, and observability; lead incident response
  • Benchmark and eliminate bottlenecks throughout the inference stack
  • Partner with ML Research teams on model optimization, quantization, and deployment
  • Develop APIs for AI inference used by both internal teams and external customers
  • Recruit, mentor, and develop engineering talent; establish team processes, engineering standards, and operational excellence
  • Work closely with the GenAI Platform, Data, and Product teams to plan and execute projects that directly impact clinicians and patients

What You’ll Bring

  • 5+ years of engineering experience, including 1+ years in a technical leadership or management role
  • Deep, hands-on experience with ML systems and inference frameworks (e.g., PyTorch, TensorRT, vLLM, TensorFlow)
  • Strong understanding of LLM architecture (e.g., Multi-Head Attention, Multi/Grouped-Query Attention, and common transformer components)
  • Experience with inference optimizations (e.g., batching, quantization, kernel fusion, FlashAttention)
  • Familiarity with GPU characteristics, roofline models, and performance analysis
  • Experience deploying reliable, distributed, real-time systems at scale
  • Experience with parallelism strategies: tensor parallelism, pipeline parallelism, expert parallelism
  • Skilled at hiring and mentorship, with a demonstrated track record of helping engineers grow their skills and careers
  • Strong technical communication and cross-functional collaboration skills
  • Comfortable giving constructive feedback on technical designs and code reviews
  • Track record of thriving in a fast-growing startup and operating with urgency and focus

Added Bonus

  • Background in training infrastructure and RL workloads
  • Skilled in building secure, compliant systems on major cloud platforms (GCP preferred, AWS experience welcome)
  • Experience with Kubernetes and container orchestration at scale
  • Published work or contributions to inference optimization research
Skills

PyTorch, TensorRT, vLLM, TensorFlow, LLM architecture, batching, quantization, kernel fusion, FlashAttention, GPU performance analysis, tensor parallelism, pipeline parallelism, expert parallelism, Kubernetes, GCP