Staff Software Engineer, Platform Infrastructure

215k – 250k · New York, NY · San Francisco, CA · DevOps / SRE · Hybrid · Feb 13

Summary

The Staff Engineer builds and owns scalable multi-cloud platform infrastructure for Astronomer's DataOps products. The role requires deep Kubernetes and Go expertise, distributed systems knowledge, and multi-cloud experience to ensure reliability at enterprise scale.

About the role

Responsibilities

Own and develop the platform infrastructure strategy: map out needs, make decisions, and take responsibility for outcomes.
Decide what to work on and how, make promises, and deliver.
Conduct principled build vs. buy assessments and advocate for appropriate tools.
Create and maintain comprehensive internal documentation and decision records.
Participate in architectural forums and make open, principled decisions.

Requirements

Depth in distributed systems, including an understanding of failure modes, consistency/availability tradeoffs, backpressure, and graceful degradation.
Kubernetes expertise at the operator level, including how the scheduler and control loops behave under load.
Strong proficiency in Go for building production systems.
Multi-cloud experience (AWS, GCP, Azure) with architectural decisions in production.
Experience defining requirements and driving technology choices across an engineering organization.
Strong written and verbal communication for design docs, postmortems, and global teams.

Nice-to-Haves

Experience with storage primitives (relational vs. object stores).
Work on SaaS/PaaS products across multiple clouds.
Familiarity with Apache Airflow or workflow orchestration.

Compensation

Estimated salary: $215,000 - $250,000 based on leveling and geography, plus equity and comprehensive benefits.

Skills

Kubernetes · Go · Distributed Systems · Multi-cloud · AWS · GCP · Azure · Apache Airflow

Similar roles at this salary range

Plaid

Jun 19

Staff Site Reliability Engineer, Release Engineering

Staff SRE on the Release Engineering team defining and scaling reliability practices, architecting SLO/error-budget programs, and driving progressive delivery and automated safety gates across product engineering.

208k – 274k · New York, NY · DevOps / SRE · Hybrid · 8+ YOE · Go · SLO

Fivetran

Jun 18

Senior Site Reliability Engineer

Senior SRE responsible for production infrastructure reliability, incident response, deployment automation, and scaling SaaS systems on Kubernetes and major cloud platforms.

175k – 210k · Oakland, CA · DevOps / SRE · Hybrid · 5+ YOE · AWS · GCP

Dropbox

Jun 18

Senior Infrastructure Software Engineer, Storage Core

Senior engineer building and operating Dropbox's exabyte-scale distributed storage systems. Focus on replication, erasure coding, performance, and reliability in Go/Rust.

180k – 274k · United States · DevOps / SRE · Remote · 9+ YOE · Go · C++

Okta

Jun 17

Staff Site Reliability Engineer - Observability

Staff SRE focused on building and scaling a comprehensive observability platform on GCP using Terraform, Splunk, and Grafana. Requires 5+ years GCP observability experience and strong coding skills in Python or Go.

194k – 267k · Bellevue, WA +4 · DevOps / SRE · Hybrid · 5+ YOE · Go · GKE

Cribl

Jun 17

Sr Software Engineer, Storage

Senior Software Engineer on the Storage team building autoscaling, self-healing infrastructure-as-code systems that manage petabyte-scale telemetry storage on AWS.

175k – 205k · United States · DevOps / SRE · Remote · 5+ YOE · Go · S3