Senior Platform Engineer

The Senior Platform Engineer owns and evolves Socket’s infrastructure, with a focus on reliability, performance, and cost at scale. The role partners with engineers on debugging, observability (Prometheus, Grafana), deployment pipelines (Kubernetes, Terraform), and on-call incident response. Requires 5+ years of software development experience, including 1+ year in a DevOps or SRE role.

United States · DevOps / SRE · Remote · 5+ YOE

About the role

What You'll Do

  • Partner closely with our engineers to debug production issues, improve performance, and design systems that scale reliably
  • Own and evolve Socket’s infrastructure, with a focus on reliability, performance, and cost as we scale
  • Help define and evolve SLIs and SLOs for new and existing systems, turning reliability into something that can be measured and improved
  • Debug, maintain, and improve our deployment pipeline, including addressing failures in production and driving meaningful improvements over time
  • Build and maintain observability across our systems (metrics, logs, traces) to support faster detection and resolution of issues
  • Participate in an on-call rotation and drive incident reviews with an emphasis on concrete follow-ups and system improvements

What You'll Bring

  • 5+ years of software development experience, including 1+ year in a DevOps or SRE role
  • Comfortable working on a distributed, cross-functional team where priorities shift and the problems change day to day
  • Experience scaling and operating production web applications, preferably in a TypeScript / Node.js environment
  • Strong knowledge of relational databases, with Postgres preferred
  • Hands-on experience building and using observability systems (Prometheus/Mimir, OpenTelemetry, Grafana)
  • Experience with container orchestration (Docker, Kubernetes)
  • Practical experience managing infrastructure-as-code with Terraform
  • Experience running systems in a cloud environment, with GCP preferred
  • Experience building and maintaining CI/CD pipelines (e.g. GitHub Actions)

Skills

TypeScript · Node.js · Postgres · Prometheus · OpenTelemetry · Grafana · Docker · Kubernetes · Terraform · GCP · GitHub Actions
