Senior Site Reliability Engineer

130k – 140k · United States · Remote · 5+ YOE
Summary

Senior SRE responsible for incident response, infrastructure reliability, database operations, and scaling production systems on AWS and Kubernetes.

About the role

Responsibilities

  • Act as a first responder for system incidents and outages
  • Own and evolve monitoring, alerting, and log management systems
  • Manage and optimize database infrastructure (MySQL, PostgreSQL, ClickHouse, Redis)
  • Maintain and improve server infrastructure and deployment pipelines
  • Collaborate with engineering teams to build scalable, resilient systems
  • Contribute to internal SRE tooling and automation efforts

Requirements

  • Deep expertise with AWS and Kubernetes
  • 5+ years of experience in a Site Reliability, DevOps, or Infrastructure Engineering role
  • Proven experience scaling production systems in a high-growth environment
  • Practical experience using AI tools to improve engineering productivity
  • Experience scaling an early-stage product to 1M+ monthly active users
  • Experience managing incident response and production system outages
  • Hands-on experience with database operations and optimization
  • Familiarity with observability tooling, monitoring, and logging best practices
  • Based in North or South America (AMER region)

Nice-to-Haves

  • Experience with SOC 2 compliance or building secure infrastructure
  • Experience with ClickHouse or similar technologies

Compensation & Benefits

  • $130,000 - $140,000 USD per year
  • Fully remote
  • 35 days of PTO annually + paid sabbatical after 5 years
  • 100% medical coverage for you and your family (or reimbursement)
  • Parental leave
  • Home office stipend
  • Learning & development stipend
  • Annual bonus potential
  • Company retreats twice a year
Skills
AWS, Kubernetes, MySQL, PostgreSQL, ClickHouse, Redis, Monitoring, Alerting, Log Management, Observability