# Senior Software Engineer, Site Reliability
**Company:** [Upstart](https://hotfix.jobs/companies/upstart)
**Location:** Remote
**Salary:** $167K-$231K
**Experience:** 10+ years
**Skills:** Ruby on Rails, React, AWS, Docker, GitHub Actions, Distributed Systems, Service-Oriented Architecture, CI/CD, Infrastructure As Code, A/B Testing
**Posted:** 2026-03-19
> Senior SRE who builds tooling and automation to improve production system reliability, with monitoring standards spanning microservices, Kubernetes, and ML platforms. Requires 6+ years in software/SRE/DevOps, proficiency in Python/Go, IaC, and observability tools.
## Job Description
## How you’ll make an impact
- Embody and share SRE principles at Upstart
- Exercise state-of-the-art SRE practices throughout the company
- Uphold a culture of visibility, ownership, and responsibility around service reliability
- Implement standards for monitoring **microservices**, **web apps**, **mobile apps**, **databases**, **Kubernetes** clusters, and **machine learning** platforms, in a fast-paced environment
- Improve incident response practices, both within SRE and throughout the company
- Automate away toil wherever automation makes sense

## What we’re looking for

**Minimum requirements:**
- Minimum of **6 years** combined experience between **Software Engineering**, **Site Reliability**, and/or **DevOps Engineering** including **CI/CD**, **TDD**, internal tooling, **observability**, and other agile development practices
- Proficiency coding in **Python**, **Go**, **JavaScript/TypeScript**
- Proficiency with **Infrastructure as Code** (**Terraform**, **CDK**, **CloudFormation**, etc.)
- Software engineering background, including experience building internal tooling from scratch and applying agile development techniques
- Strong software design & architecture skills
- Solid fundamentals in **data structures** & **algorithms**
- Experience with **on-call** and **incident management** environments
- Experience with **observability**, **monitoring**, and reporting tools (e.g., **Datadog**, **Sumo Logic**)
- Experience supporting **SaaS** software in a **microservice**-oriented **cloud** environment
- Ability to work with multiple teams for enterprise-wide deliverables
- **Data**/**metrics**-driven mindset

**Preferred qualifications:**
- Experience with **service mesh**
- **Full Stack** development skills
- Experience building tooling for an **observability** platform
- Experience leveraging **LLM**/**GenAI** to improve SRE efficiency and processes

**Apply:** https://hotfix.jobs/jobs/senior-software-engineer-site-reliability-at-upstart-07a5ee85-8686-419b-9633-aa6f81d284a5
**Canonical:** https://hotfix.jobs/jobs/senior-software-engineer-site-reliability-at-upstart-07a5ee85-8686-419b-9633-aa6f81d284a5