Senior Manager, Infrastructure Platform Engineering

Lead a team building core infrastructure platform services for large-scale compute capacity allocation, state management, and security. Requires 10+ years infrastructure experience and 3+ years in engineering leadership.

245k – 295k | San Francisco, CA | Sunnyvale, CA | Engineering Management | Onsite | 10+ YOE

About the role

What You'll Be Working On

  • Leading the team responsible for the platform services that abstract underlying infrastructure into reliable, allocatable capacity, and for the systems that track and reconcile state across a large fleet
  • Setting the technical roadmap across capacity and utilization intelligence, resource lifecycle and state management, and platform security and trust frameworks
  • Driving the design of secure, well-instrumented platform systems — from Kubernetes-based orchestration and automation to lower-level system and hardware integration
  • Hiring, mentoring, and growing a team of infrastructure software engineers; building a high-performing organization from a strong foundation
  • Partnering with infrastructure, production engineering, and security teams to align platform capabilities with operational reliability, capacity, and trust requirements
  • Improving platform efficiency and availability — characterizing bottlenecks, reducing stranded resources, and shortening operational and recovery cycles
  • Establishing engineering standards for infrastructure software development: code quality, testing, deployment safety, and on-call practices for systems that span the platform
  • Translating a vertically integrated infrastructure stack into reliable platform primitives that engineering teams can build on
  • Staying technically hands-on — reviewing designs, contributing to architecture decisions, and being credible to the engineers you lead

What You'll Bring to the Team

  • 10+ years of experience in infrastructure or systems software development, including at least 3 years in an engineering leadership role
  • Deep expertise in large-scale infrastructure platforms — building services that pool, allocate, and reconcile compute resources at scale
  • Strong background with Kubernetes and cloud platforms (GCP, AWS, or Azure) — orchestration, automation, and operating distributed systems in production
  • Experience with distributed state management and control systems — modeling resource and system lifecycle, reconciling desired vs. actual state, and handling failure gracefully across a large fleet
  • Experience with efficiency, capacity, or performance engineering — characterizing system behavior, identifying bottlenecks, and driving measurable improvements in utilization or availability
  • A player-coach approach to management: hands-on enough to make technical calls, structured enough to grow a team and ship through them
  • Track record of hiring strong infrastructure engineers and helping them grow into more senior roles
  • Comfortable operating in a fast-moving environment where the path isn't fully paved — willing to drive ambiguity to clarity

Bonus Points

  • Experience operating Kubernetes on bare-metal infrastructure as well as on managed cloud services (GKE, EKS, AKS)
  • Familiarity with the operational challenges of GPU clusters, AI training, and inference workloads
  • Working knowledge of platform security and trust concepts — secure boot, measured boot, TPMs, and hardware attestation
  • Experience with capacity forecasting, demand modeling, or allocation optimization at scale
  • Hands-on background with telemetry and observability platforms at scale (Prometheus, OpenTelemetry, Grafana)
  • Prior experience building infrastructure platforms at hyperscalers or cloud providers where internal engineers are the primary customer
  • Familiarity with hardware-software co-design — understanding how platform choices affect physical infrastructure utilization

Benefits

  • Competitive compensation and equity packages
  • Restricted Stock Units
  • Paid time off, paid holidays & leave of absence programs
  • Comprehensive health, dental & vision insurance
  • Employer HSA contributions
  • Paid parental leave
  • Paid life insurance, short-term and long-term disability
  • Professional development & tuition reimbursement
  • Mental health & wellness support
  • Commuter benefits (parking & transit)
  • Cell phone stipend
  • 401(k) retirement plan with company match up to 4% of salary
  • Volunteer time off
  • Global travel insurance & emergency assistance
  • Daily meals allowance
  • Additional perks & programs specific to location

Skills

Kubernetes, GCP, AWS, Azure, Infrastructure, Systems Software, Distributed Systems, Capacity Planning, Performance Engineering, Platform Security, Observability, Prometheus, OpenTelemetry, Grafana, Bare Metal

Senior Engineering Manager, Managed Platform Services

Lead the Command Center Insights & Actions team building observability, alerting, and automated remediation systems for Crusoe's AI cloud infrastructure. Own roadmap, mentor engineers, and drive technical excellence in a high-scale environment.

245k – 295k | San Francisco, CA +1 | Engineering Management | On-site | 7+ YOE | Telemetry | Heuristics

Sr. Manager, Engineering, Ad Formats

Lead the Ads Format team as Senior Manager of Engineering, building the next-generation server-driven UI framework for Ads Creation and Personalization. The role involves managing a team of 15 engineers and making key architectural decisions.

242k – 430k | San Francisco, CA +1 | Engineering Management | Hybrid | 9+ YOE | AI | Statistics

Engineering Manager, Data Platform

Engineering Manager leading teams to build scalable data infrastructure processing petabytes of data for Discord's gaming platform. Requires 7+ years software engineering experience in distributed systems/data infra, 2+ years leadership, and data tools expertise.

248k – 279k | San Francisco, CA | Engineering Management | On-site | 7+ YOE | SQL | AWS

Engineering Manager, Enterprise

Lead enterprise engineering end-to-end, scaling revenue 10x while building security/compliance features and hiring/coaching a world-class team. Requires proven experience transitioning SMB teams to enterprise and strong engineering judgment.

250k – 375k | San Francisco, CA | Engineering Management | On-site | 7+ YOE | SSO | SAML

Engineering Manager, Core Engineering

Lead engineering for fraud detection, identity verification, and compliance systems at a Series C AI infrastructure company. Build AI-powered risk and decision systems while managing a high-performing backend team.

250k – 400k | San Francisco, CA +1 | Engineering Management | On-site | 8+ YOE | AI Systems | Risk Scoring