Director II, Engineering

Leads the Compute Platform & Unified Deployment team in architecting and building secure, multi-tenant, cloud-native compute infrastructure for Confluent Cloud. Requires 10+ years in hyper-scale distributed systems, Kubernetes expertise, and technical leadership to ensure reliability and scalability.

United States · DevOps / SRE · Remote · 10+ YOE

About the role

What You’ll Do

  • Drive technical charter for Compute Platform & Unified Deployment (cloud infrastructure, big data, security & AI)
  • Build roadmap with product and engineering management to enable new business opportunities
  • Deliver high-impact initiatives in security, reliability, multi-tenancy, architecture, and component refactoring
  • Serve as technical leader and representative for engineering
  • Lead team and product on definition, design, and delivery

What We're Looking For

  • 10+ years delivering scalable software solutions
  • Proven track record leading large-scale, highly available, low-latency systems
  • Deep hands-on expertise in hyper-scale distributed systems engineering
  • Deep expertise in multi-cluster/multi-tenant Kubernetes (big plus)
  • Deep expertise in safe deployment of mission-critical workloads (big plus)
  • Expertise in Language Runtimes (JVM, WASM, CPython) (big plus)
  • Expertise in Container & Virtualization/Hypervisor technologies (Firecracker, gVisor, cloud-hypervisor, Kata Containers)
  • Experience in large-scale safe deployment technologies & best practices
  • Experience in Cloud Native technologies including networking & security
  • Experience driving operational excellence for large production services
  • Track record of technical leadership and mentorship
  • Track record of cross-team collaboration

Skills

Kubernetes · Cloud Native · Distributed Systems · Containers · Virtualization · JVM · WASM · CPython · Firecracker · gVisor · Networking · Security · Kafka · Flink

Similar roles

DevOps / SRE jobs

Director of Site Reliability Engineering

Lead and develop a distributed SRE team, setting vision and operating model for reliability, infrastructure, and service ownership across engineering. Own core infrastructure services and drive operational maturity, incident response, and developer productivity.

210k – 310k · San Francisco, CA · DevOps / SRE · Hybrid · 10+ YOE · SRE · AWS

Director of Site Reliability Engineering

Lead and develop a distributed SRE team, setting vision and operating model for reliability practices. Own core infrastructure services (Kubernetes, CI/CD, observability) and drive service ownership frameworks across engineering teams.

210k – 310k · New York, NY · DevOps / SRE · Hybrid · 10+ YOE · SRE · AWS

Director of Platform & Reliability Engineering

The Director of Platform & Reliability Engineering will lead an engineering organization responsible for secure, scalable, and highly reliable products. This role involves setting the vision for internal platforms, cloud infrastructure, developer enablement, and production operations.

235k – 245k · San Francisco, CA · DevOps / SRE · Hybrid · 8+ YOE · CI/CD · Kubernetes

Director of SRE (FTE)

Leads SRE strategy, reliability, observability, incident management, and QA for a cloud-native healthcare EMR platform. Oversees Azure AKS, Kubernetes, CI/CD, and vendor teams; requires 12+ years of SRE experience, including 5+ years in leadership.

175k – 200k · United States · DevOps / SRE · Remote · 12+ YOE · Splunk · Python

Platform Staff Engineer - Universal Directory

Staff engineer on the Universal Directory Platform team who builds scalable distributed systems for identity management using Java and cloud infrastructure. Requires 7+ years of Java experience, expertise in Spring Boot/Hibernate, and 3+ years deploying services on AWS/GCP.

194k – 243k · San Francisco, CA · DevOps / SRE · Hybrid · 7+ YOE · Go · AWS