Staff Software Engineer (Technical Lead), Storage

204k – 255k · United States · Remote · 9+ YOE · Jun 15

Summary

Staff-level infrastructure engineer leading teams that build and operate Airbnb's critical KV stores, caching layers, coordination services, and data ingestion pipelines at massive scale.

About the role

Responsibilities

Own and operate a highly available, low-latency, distributed, multi-tenant KV store serving millions of read QPS at 99.9%+ availability.
Manage control planes and clients for ElasticCache clusters handling million+ IOPS and indexing QPS.
Operate a scalable, reliable, performant distributed coordination service supporting MySQL, Redis, Kafka, Flink, Druid, Zookeeper, and other systems.
Build and operate managed data export solutions including near real-time CDC and periodic mutation/full table snapshots.
Lead a team of developers to deliver multi-quarter cross-functional projects.
Stay current with data ingestion systems; evaluate and incorporate new technologies to improve the architecture.
Influence team and organizational long-term roadmap and strategy.
Mentor and coach team members to enhance skills and technical standards.
Raise operational standards by proactively identifying, debugging, and fixing issues; participate in on-call rotation.

Requirements

9+ years of relevant industry experience.
Proven track record of leading and mentoring engineering teams, setting technical direction, and growing engineers.
Deep expertise in distributed systems, multi-tenant storage, and infrastructure; experience architecting and scaling high-performance, business-critical systems.
Demonstrated ability to collaborate and influence across teams, building alignment on technical strategy.
Strong judgment on technical trade-offs balancing short-term delivery with long-term maintainability.
Experience onboarding to and navigating complex codebases and enabling others to do the same.

Skills

Distributed Systems, KV Stores, Caching, ElasticCache, Redis, Kafka, MySQL, Flink, Druid, Zookeeper, CDC, Data Ingestion, Infrastructure, Multi-tenant Storage
