Staff Software Engineer - Cloud Infrastructure and Applications

182k – 229k · San Mateo, CA · DevOps / SRE · Hybrid · 8+ YOE
Summary

Designs and implements scalable cloud infrastructure for a healthcare AI platform using Kubernetes, Terraform, and AWS/GCP. Owns DevOps pipelines, automation, reliability, and security. Requires 8+ years of experience.

About the role

What You’ll Do

  • Create, implement, and support DevOps strategies and continuous delivery pipelines with cross-functional agile teams
  • Own the definition and implementation of infrastructure, tools, and processes for continuous delivery of changes, and identify potential issues
  • Play a vital role in maturing continuous delivery processes for high availability and quality
  • Design and build infrastructure to support existing and upcoming products
  • Plan for infrastructure maintainability and foresee weaknesses
  • Identify new technologies to improve automation
  • Document infrastructure setup and best practices

What We’re Looking For

  • 8+ years of relevant work experience
  • Hands-on SaaS delivery experience with AWS or GCP systems including incident response
  • Experience in automating build, test, package, release, and configuration management
  • Experience with Terraform
  • Good understanding of Linux/Unix fundamentals and debugging skills
  • Strong scripting skills (Bash, Python, NodeJS, Go)
  • Experience defining and deploying monitoring, metrics, and logging systems
  • Recent hands-on experience creating and managing containerized deployments (Kubernetes)
  • Demonstrable experience with networks, security, load balancers, DNS, etc.
  • Rigor in code quality, automated testing, and other engineering best practices
Skills
Terraform, Kubernetes, AWS, GCP, Linux, Python, Bash, Go, NodeJS, DevOps, CI/CD, monitoring, networking, security
Similar roles at this salary range
Plaid

Staff Site Reliability Engineer, Release Engineering

Staff SRE on the Release Engineering team defining and scaling reliability practices, architecting SLO/error-budget programs, and driving progressive delivery and automated safety gates across product engineering.

208k – 274k · New York, NY · DevOps / SRE · Hybrid · 8+ YOE · Go · SLO
Fivetran

Senior Site Reliability Engineer

Senior SRE responsible for production infrastructure reliability, incident response, deployment automation, and scaling SaaS systems on Kubernetes and major cloud platforms.

175k – 210k · Oakland, CA · DevOps / SRE · Hybrid · 5+ YOE · AWS · GCP
Dropbox

Senior Infrastructure Software Engineer, Storage Core

Senior engineer building and operating Dropbox's exabyte-scale distributed storage systems, with a focus on replication, erasure coding, performance, and reliability in Go/Rust.

180k – 274k · United States · DevOps / SRE · Remote · 9+ YOE · Go · C++
Okta

Staff Site Reliability Engineer - Observability

Staff SRE focused on building and scaling a comprehensive observability platform on GCP using Terraform, Splunk, and Grafana. Requires 5+ years GCP observability experience and strong coding skills in Python or Go.

194k – 267k · Bellevue, WA +4 · DevOps / SRE · Hybrid · 5+ YOE · Go · GKE
Cribl

Sr Software Engineer, Storage

Senior Software Engineer on the Storage team building autoscaling, self-healing infrastructure-as-code systems that manage petabyte-scale telemetry storage on AWS.

175k – 205k · United States · DevOps / SRE · Remote · 5+ YOE · Go · S3