Staff Software Engineer, Platform Infrastructure

215k – 250k · New York, NY · San Francisco, CA · DevOps / SRE · Hybrid
Summary

Staff engineer builds and owns scalable multi-cloud platform infrastructure for Astronomer's DataOps products. Requires deep Kubernetes/Go expertise, distributed systems knowledge, and multi-cloud experience to ensure reliability at enterprise scale.

About the role

Responsibilities

  • Own and develop the platform infrastructure strategy: map out needs, make decisions, and own outcomes.
  • Decide what to work on and how, make promises, and deliver.
  • Conduct principled build vs. buy assessments and advocate for appropriate tools.
  • Create and maintain comprehensive internal documentation and decision records.
  • Participate in architectural forums and make open, principled decisions.

Requirements

  • Depth in distributed systems, including an understanding of failure modes, consistency/availability tradeoffs, backpressure, and graceful degradation.
  • Kubernetes expertise at the operator level, including how the scheduler and control loops behave under load.
  • Strong proficiency in Go for building production systems.
  • Multi-cloud experience (AWS, GCP, Azure), including architectural decisions made for production systems.
  • Experience defining requirements and driving technology choices across the engineering organization.
  • Strong written and verbal communication for design docs, postmortems, and global teams.

Nice-to-Haves

  • Experience with storage primitives (relational vs. object stores).
  • Work on SaaS/PaaS products across multiple clouds.
  • Familiarity with Apache Airflow or workflow orchestration.

Compensation

  • Estimated salary: $215,000 - $250,000 based on leveling and geography, plus equity and comprehensive benefits.
Skills
Kubernetes, Go, Distributed Systems, Multi-cloud, AWS, GCP, Azure, Apache Airflow