Sr. Manager - Production Engineering

222k – 300k | Bellevue, WA | Mountain View, CA | San Francisco, CA | Remote | 8+ YOE
Summary

Lead an engineering team managing cloud IAM operations, CSP provisioning, compliance, and security data pipelines across AWS, Azure, and GCP. Requires 8+ years in security or cloud engineering, 5+ years of management experience, and a BS in a technical field.

About the role

The impact you will have

  • Build, lead, and grow a high-performing team of engineers responsible for cloud IAM operations, CSP environment management, security data pipelines, and compliance operations across AWS, Azure, and GCP.
  • Define and execute the strategy and roadmap for automating cloud access assignment, approval workflows, and IAM policy enforcement to reduce manual toil while strengthening security controls.
  • Own the end-to-end lifecycle of CSP account, subscription, and project provisioning, including secure onboarding of acquired companies' cloud environments into Databricks' organizations with minimal disruption.
  • Drive compliance programs including Cloud User Access Reviews, audit evidence collection, and IAM policy alignment to meet SOC2, FedRAMP, and other regulatory requirements.
  • Ensure the reliability and timeliness of security data pipelines that ingest CSP audit logging, enabling downstream detection and response capabilities.
  • Partner closely with Security, Security Engineering, and IT to interpret and operationalize security policies, ensuring consistent enforcement with high transparency and minimal friction for engineering teams.
  • Lead complex, multi-quarter initiatives spanning multiple teams and external partners, demonstrating leverage by executing through technical leads and developing future leaders on the team.
  • Lead GovCloud escort operations, including staffing a 24x7 on-call rotation for SEV0 incidents, and continuously improve operational resilience.

What we look for

  • 8+ years of experience in security engineering, cloud infrastructure, or production/site reliability engineering, with deep hands-on expertise in at least one major cloud provider (AWS, Azure, or GCP).
  • 5+ years of engineering management experience, including building teams, developing senior engineers and managers, and navigating complex people situations such as promotions and performance management.
  • Strong technical understanding of cloud IAM (policies, principals, roles, federation), CSP organizational structures, and identity governance frameworks.
  • Demonstrated success leading operational teams that balance high-throughput request execution with automation and process improvement.
  • Experience with compliance and audit workflows (SOC2, FedRAMP, ISO, or similar), including evidence collection and access review programs.
  • Track record of driving cross-functional initiatives that require influencing without direct authority across security, infrastructure, and engineering organizations.
  • Strong communication and stakeholder management skills, with the ability to translate security policy into practical operational processes that engineering teams can follow.
  • BS (or higher) in Computer Science, Information Security, or a related technical field.
Skills
AWS, Azure, GCP, IAM, Kubernetes, SOC2, FedRAMP, Cloud Security, Identity Governance, Security Data Pipelines