Senior, Site Reliability Engineer (SRE)

160k – 235k | San Francisco, CA | Menlo Park, CA | DevOps / SRE | Hybrid | 8+ YOE
Summary

The Senior SRE designs, builds, and improves infrastructure, reliability, security, and observability for a healthcare delivery platform. The role requires 8+ years of experience with cloud (AWS/GCP), Terraform, automation, and incident response in high-scale environments.

About the role

What you will do

  • Design, build, and improve the infrastructure that powers Sprinter’s patient care, clinician operations, internal tooling, and partner-facing systems
  • Improve reliability across distributed systems, cloud infrastructure, CI/CD, observability, and incident response
  • Raise the security baseline across cloud infrastructure, access controls, secrets management, identity, and operational workflows
  • Build and maintain infrastructure as code using Terraform and related tooling
  • Automate manual infrastructure and operational processes through scripting, tooling, and platform improvements
  • Partner with engineering teams to improve system architecture, deployment practices, monitoring, logging, and alerting
  • Troubleshoot complex issues across infrastructure, application, data, and operational boundaries
  • Help define reliability, security, and infrastructure standards that allow Sprinter to scale without creating brittle systems
  • Support incident response practices, postmortems, operational readiness, and continuous improvement across engineering
  • Make pragmatic tradeoffs between reliability, security, speed, and simplicity in a fast-moving startup environment

What you have done

  • Spent 8+ years in site reliability engineering, platform engineering, infrastructure engineering, security engineering, or related technical roles
  • Led high-impact infrastructure, reliability, platform, or security projects end to end with minimal oversight
  • Built and operated production systems in cloud environments, ideally AWS and/or GCP
  • Worked deeply with infrastructure as code, ideally Terraform
  • Improved observability, monitoring, logging, alerting, and incident response practices across engineering teams
  • Automated infrastructure, deployment, or operational workflows using scripting languages such as Python, Bash, or TypeScript
  • Improved cloud security, access management, secrets management, networking, or operational controls
  • Troubleshot production issues across application, infrastructure, networking, and deployment layers
  • Worked in environments where reliability, security, ambiguity, and speed all matter
  • Made technical decisions that balanced immediate business needs with long-term scalability, reliability, and maintainability

What gives you an edge

  • You’ve built or scaled infrastructure in health tech, logistics, marketplace, fintech, or other operationally complex environments
  • You’ve worked in mid- or growth-stage startups where speed, ambiguity, and pragmatic decision-making were required
  • You have experience improving security posture in a practical, engineering-friendly way
  • You’ve helped establish reliability standards, incident response practices, or platform patterns across an engineering org
  • You’re comfortable working directly with product engineers, data teams, operations, security stakeholders, and technical leadership
  • You have experience mentoring engineers and raising the operational bar across a broader engineering team
  • You’ve worked in regulated environments and understand the importance of privacy, security, and compliance best practices
  • You have people management experience or interest in growing into broader technical leadership over time

Our Technology Stack

  • Terraform and infrastructure-as-code tooling
  • AWS
  • GCP
  • TypeScript
  • Python
  • Bash
  • CI/CD systems
  • Monitoring, logging, and observability platforms
  • Identity, access, and secrets management systems
  • Cloud networking and infrastructure tooling
  • Container and deployment systems
  • Serverless AWS, including AppSync, DynamoDB, Lambda, Amplify, CloudFormation, and Node
  • GraphQL
  • React Native and React Native for Web
Skills
Terraform, AWS, GCP, Python, TypeScript, Bash, CI/CD, Kubernetes, GraphQL, DynamoDB, Lambda, Prometheus, Grafana, PagerDuty, Infrastructure as Code