Software Engineer, Platform

215k – 285k · United States · DevOps / SRE · Remote · 5+ YOE
Summary

Owns and scales platform infrastructure, including edge and cloud services on Cloudflare, GCP, and Vercel and data layers such as Spanner, ClickHouse, and Postgres, to serve millions of LLM requests daily. Requires 5+ years operating production infrastructure, with experience across cloud platforms and databases and comfort working in a full-stack TypeScript codebase.

About the role

What You'll Do

  • Own and evolve our edge and cloud infrastructure across Cloudflare, Google Cloud, and Vercel.
  • Scale and operate our data layer including Spanner, ClickHouse, and Postgres.
  • Optimize performance for serving LLM inference as traffic grows rapidly.
  • Partner with engineering leadership on capacity, reliability, and cost across the routing layer, with ownership of the systems carrying production traffic.
  • Set the bar and playbook for how we run infrastructure and operations as the team grows — tooling, observability, on-call, and the patterns other engineers build against.

About You

  • 5+ years building and operating production infrastructure at companies where uptime, latency, and cost matter.
  • Proven experience with cloud platforms (GCP, AWS, Azure) and edge-first serverless platforms (e.g. Cloudflare Workers).
  • Deep expertise in operating large-scale databases (e.g. Postgres, Spanner).
  • A full-stack TypeScript shop won't faze you; you can move across the stack when the platform needs it.
  • High agency and a bias toward action. You don't wait for tickets — you see the bottleneck and fix it.
  • AI-forward in your workflow. You use coding agents, MCPs, and LLMs heavily and have opinions about what works.
  • Pragmatic about tradeoffs between speed and simplicity.

Bonus Points

  • Existing user of OpenRouter, or active side projects in AI products/infrastructure or developer tooling.

Compensation: Base salary $215,000 - $285,000 plus benefits & equity (US full-time). International compensation varies by local market.

Skills
Google Cloud, Cloudflare Workers, Spanner, ClickHouse, Postgres, TypeScript, GCP, AWS, Azure, LLM inference