Engineering Manager, Runtime Fabric

165k – 330k | San Francisco, CA | Engineering Management | Hybrid | 7+ YOE
Summary

Lead the Runtime Fabrics team building container runtime and storage layers purpose-built for AI inference workloads. Manage systems engineers, set technical direction for containerd/runc extensions, and drive open-source contributions.

About the role

Responsibilities

Team Leadership & Culture

  • Recruit, hire, and develop a high-performing team of systems engineers with deep container and Linux expertise
  • Foster a culture of technical rigor, open-source contribution, and continuous improvement
  • Provide regular coaching, feedback, and career development support to direct reports
  • Partner with engineering leadership to define the long-term vision and roadmap for container runtime and storage infrastructure

Technical Direction

  • Guide the team in extending and hardening containerd, runc, and related OCI ecosystem projects to meet the GPU-specific requirements of production AI inference, including startup performance, GPU device access, and multi-tenant isolation
  • Oversee the architecture and evolution of the Baseten Delivery Network (BDN): the tiered caching and weight delivery system that makes cold starts 2–3x faster and eliminates thundering-herd failures during burst scaling events
  • Drive the expansion of BDN's architecture, currently focused on model weights, to container images, training checkpoints, and deployment artifacts
  • Provide technical oversight on GPU-aware isolation mechanisms for multi-tenant inference, including secure container runtimes, Linux namespace hardening, and longer-term micro-VM integration
  • Ensure the team maintains end-to-end ownership of the container startup performance path, from snapshotter initialization through weight delivery to first inference request
  • Champion the team's contributions back to the open-source containerd ecosystem alongside a team of core maintainers

Cross-Functional Partnership

  • Act as the primary advocate for Runtime Fabrics across the organization, ensuring upstream and downstream teams have the integration support they need
  • Collaborate with product and engineering stakeholders to prioritize investments based on business impact and infrastructure reliability
  • Communicate team progress, technical trade-offs, and architectural decisions clearly to leadership

Requirements

  • Proven experience managing and growing engineering teams in a systems, infrastructure, or low-level runtime context
  • Deep familiarity with the Linux container ecosystem: containerd, runc, OCI Runtime Spec, Linux namespaces, and cgroups, with the ability to engage credibly in code reviews and architectural discussions
  • Contributions to containerd/containerd, opencontainers/runc, google/gvisor, kata-containers/kata-containers, or closely related open-source projects
  • Strong systems programming background in Go and/or C/C++
  • Experience with distributed storage systems, content-addressable storage, or large-scale caching infrastructure
  • Understanding of how container images are structured, stored, and delivered at scale
  • Strong written and verbal communication skills, with the ability to influence without authority across teams

Nice to Have

  • Experience with GPU device access in containers: NVIDIA Container Toolkit, CDI (Container Device Interface), or GPU-aware scheduling
  • Familiarity with lazy-loading snapshotters (stargz, soci, EROFS/Nydus) or peer-to-peer image distribution
  • Experience with secure container runtimes (gVisor, Sysbox) or micro-VM technologies (Firecracker, Cloud Hypervisor)
  • Understanding of containerd's shim API (v2) and experience building custom shim implementations
  • Background in multi-tenant infrastructure or security-sensitive serving environments

Benefits

  • Competitive compensation, including meaningful equity
  • 100% coverage of medical, dental, and vision insurance for employees and dependents
  • Flexible PTO policy, including a company-wide Winter Break
  • Paid parental leave
  • Fertility and family-building stipend through Carrot
  • Company-facilitated 401(k)
Skills
containerd, runc, OCI Runtime Spec, Linux namespaces, cgroups, Go, C, C++, distributed storage systems, content-addressable storage, large-scale caching infrastructure, container images, GPU device access, NVIDIA Container Toolkit, CDI