Senior Software Engineer, Site Reliability

The Senior SRE builds tooling and automation to improve production system reliability, including monitoring for microservices, Kubernetes clusters, and ML platforms. Requires 6+ years in software engineering, SRE, or DevOps, plus proficiency in Python/Go, IaC, and observability tools.

167k – 231k · United States · DevOps / SRE · Remote · 10+ YOE

About the role

How you’ll make an impact

  • Embody and share SRE principles at Upstart
  • Exercise state-of-the-art SRE practices throughout the company
  • Uphold a culture of visibility, ownership, and responsibility around service reliability
  • Implement standards for monitoring microservices, web apps, mobile apps, databases, Kubernetes clusters, and machine learning platforms, in a fast-paced environment
  • Improve incident response practices, both within SRE and throughout the company
  • Automate away toil wherever automation makes sense

What we’re looking for

Minimum requirements:

  • Minimum of 6 years of combined experience across Software Engineering, Site Reliability, and/or DevOps Engineering, including CI/CD, TDD, internal tooling, observability, and other agile development practices
  • Proficiency coding in Python, Go, JavaScript/TypeScript
  • Proficiency with Infrastructure as Code (Terraform, CDK, CloudFormation, etc.)
  • Software engineering background with experience building internal tooling from scratch and applying agile development techniques
  • Strong software design & architecture skills
  • Sound fundamentals in data structures & algorithms
  • Experience with on-call and incident management environments
  • Experience with observability, monitoring, and reporting tools (e.g., Datadog, Sumo Logic)
  • Experience supporting SaaS software in a microservice-oriented cloud environment
  • Ability to work with multiple teams for enterprise-wide deliverables
  • Data/metrics-driven mindset

Preferred qualifications:

  • Experience with service mesh
  • Full Stack development skills
  • Experience building tooling for an observability platform
  • Experience leveraging LLM/GenAI to improve SRE efficiency and processes

Skills

Ruby on Rails, React, AWS, Docker, GitHub Actions, Distributed Systems, Service-Oriented Architecture, CI/CD, Infrastructure As Code, A/B Testing
