Staff Software Engineer - Cloud Infrastructure and Applications

182k – 229k | San Mateo, CA | DevOps / SRE | Hybrid | 8+ YOE | Mar 17

Summary

Designs and implements scalable cloud infrastructure for a healthcare AI platform using Kubernetes, Terraform, and AWS/GCP. Owns DevOps pipelines, automation, reliability, and security. Requires 8+ years of experience.

About the role

What You’ll Do

Create, implement, and support DevOps strategies and continuous delivery pipelines with cross-functional agile teams
Own the definition and implementation of infrastructure, tools, and processes for continuous delivery of changes, and identify potential issues
Play a vital role in maturing continuous delivery processes for high availability and quality
Design and build infrastructure to support existing and upcoming products
Plan for infrastructure maintainability and foresee weaknesses
Identify new technologies to improve automation
Document infrastructure setup and best practices

What We’re Looking For

8+ years of relevant work experience
Hands-on SaaS delivery experience with AWS or GCP systems including incident response
Experience in automating build, test, package, release, and configuration management
Experience with Terraform
Good understanding of Linux/Unix fundamentals and debugging skills
Strong scripting skills (Bash, Python, NodeJS, Go)
Experience defining and deploying monitoring, metrics, and logging systems
Recent hands-on experience creating and managing containerized deployments (Kubernetes)
Demonstrable experience with networks, security, load balancers, DNS, etc.
Rigor in code quality, automated testing, and other engineering best practices

Skills

Terraform, Kubernetes, AWS, GCP, Linux, Python, Bash, Go, NodeJS, DevOps, CI/CD, monitoring, networking, security

Similar roles at this salary range

Plaid

Jun 19

Staff Site Reliability Engineer, Release Engineering

Staff SRE on the Release Engineering team defining and scaling reliability practices, architecting SLO/error-budget programs, and driving progressive delivery and automated safety gates across product engineering.

208k – 274k | New York, NY | DevOps / SRE | Hybrid | 8+ YOE | Go, SLO

Fivetran

Jun 18

Senior Site Reliability Engineer

Senior SRE responsible for production infrastructure reliability, incident response, deployment automation, and scaling SaaS systems on Kubernetes and major cloud platforms.

175k – 210k | Oakland, CA | DevOps / SRE | Hybrid | 5+ YOE | AWS, GCP

Dropbox

Jun 18

Senior Infrastructure Software Engineer, Storage Core

Senior engineer building and operating Dropbox's exabyte-scale distributed storage systems. Focus on replication, erasure coding, performance, and reliability in Go/Rust.

180k – 274k | United States | DevOps / SRE | Remote | 9+ YOE | Go, C++

Okta

Jun 17

Staff Site Reliability Engineer - Observability

Staff SRE focused on building and scaling a comprehensive observability platform on GCP using Terraform, Splunk, and Grafana. Requires 5+ years GCP observability experience and strong coding skills in Python or Go.

194k – 267k | Bellevue, WA +4 | DevOps / SRE | Hybrid | 5+ YOE | Go, GKE

Cribl

Jun 17

Sr Software Engineer, Storage

Senior Software Engineer on the Storage team building autoscaling, self-healing infrastructure-as-code systems that manage petabyte-scale telemetry storage on AWS.

175k – 205k | United States | DevOps / SRE | Remote | 5+ YOE | Go, S3
