Engineering Manager, Model Inference

220k – 270k | San Francisco, CA | Engineering Management | Hybrid | 5+ YOE | May 20

Summary

Engineering Manager leading the Model Inference team, responsible for architecting and scaling low-latency, high-throughput LLM serving infrastructure and growing a team of AI inference engineers.

About the role

What You’ll Do

Lead and grow a high-performing team of AI inference engineers focused on building and scaling infrastructure for Abridge’s products and APIs
Own the technical direction of our inference systems—making key decisions around batching, throughput, latency, and GPU utilization
Architect and scale inference infrastructure for reliability, efficiency, and observability; lead incident response
Benchmark and eliminate bottlenecks throughout the inference stack
Partner with ML Research teams on model optimization, quantization, and deployment
Develop APIs for AI inference used by both internal teams and external customers
Recruit, mentor, and develop engineering talent; establish team processes, engineering standards, and operational excellence
Work closely with the GenAI Platform, Data, and Product teams to plan and execute projects that directly impact clinicians and patients

What You’ll Bring

5+ years of engineering experience, including 1+ years in a technical leadership or management role
Deep, hands-on experience with ML systems and inference frameworks (e.g., PyTorch, TensorRT, vLLM, TensorFlow)
Strong understanding of LLM architecture (e.g., Multi-Head Attention, Multi/Grouped-Query Attention, and common transformer components)
Experience with inference optimizations (e.g., batching, quantization, kernel fusion, FlashAttention)
Familiarity with GPU characteristics, roofline models, and performance analysis
Experience deploying reliable, distributed, real-time systems at scale
Experience with parallelism strategies: tensor parallelism, pipeline parallelism, expert parallelism
Skilled at hiring and mentorship, with a demonstrated track record of helping engineers grow their skills and careers
Strong technical communication and cross-functional collaboration skills
Comfortable giving constructive feedback on technical designs and code reviews
Experience thriving in a fast-growing startup and operating with urgency and focus

Added Bonus

Background in training infrastructure and RL workloads
Skilled in building secure, compliant systems on major cloud platforms (GCP preferred, AWS experience welcome)
Experience with Kubernetes and container orchestration at scale
Published work or contributions to inference optimization research

Skills

PyTorch, TensorRT, vLLM, TensorFlow, LLM architecture, batching, quantization, kernel fusion, FlashAttention, GPU performance analysis, tensor parallelism, pipeline parallelism, expert parallelism, Kubernetes, GCP
