Staff Software Engineer - Cloud Infrastructure and Applications
182k – 229k | San Mateo, CA | DevOps / SRE | Hybrid | 8+ YOE
Summary
Designs and implements scalable cloud infrastructure for a healthcare AI platform using Kubernetes, Terraform, and AWS/GCP. Owns DevOps pipelines, automation, reliability, and security. Requires 8+ years of experience.
About the role
What You’ll Do
- Create, implement, and support DevOps strategies and continuous delivery pipelines with cross-functional agile teams
- Own the definition and implementation of infrastructure, tools, and processes for continuous delivery of change, and identify potential issues
- Play a vital role in maturing continuous delivery processes for high availability and quality
- Design and build infrastructure to support existing and upcoming products
- Plan for infrastructure maintainability and foresee weaknesses
- Identify new technologies to improve automation
- Document infrastructure setup and best practices
What We’re Looking For
- 8+ years of relevant work experience
- Hands-on SaaS delivery experience with AWS or GCP systems including incident response
- Experience in automating build, test, package, release, and configuration management
- Experience with Terraform
- Good understanding of Linux/Unix fundamentals and debugging skills
- Strong scripting skills (Bash, Python, NodeJS, Go)
- Experience defining and deploying monitoring, metrics, and logging systems
- Recent hands-on experience creating and managing containerized deployments (Kubernetes)
- Demonstrable experience with networks, security, load balancers, DNS, etc.
- Rigor in code quality, automated testing, and other engineering best practices
Skills
Terraform, Kubernetes, AWS, GCP, Linux, Python, Bash, Go, NodeJS, DevOps, CI/CD, monitoring, networking, security