Software Engineer, Platform

215k – 285k · United States · DevOps / SRE · Remote · 5+ YOE · Apr 23

Summary

Owns and scales platform infrastructure, including edge and cloud services on Cloudflare, GCP, and Vercel and data layers such as Spanner, ClickHouse, and Postgres, to serve millions of LLM requests daily. Requires 5+ years of production infrastructure experience with cloud platforms and databases, plus full-stack TypeScript expertise.

About the role

What You'll Do

Own and evolve our edge and cloud infrastructure across Cloudflare, Google Cloud, and Vercel.
Scale and operate our data layer including Spanner, ClickHouse, and Postgres.
Optimize performance for serving LLM inference as traffic grows rapidly.
Partner with engineering leadership on capacity, reliability, and cost across the routing layer, with ownership of the systems carrying production traffic.
Set the bar and playbook for how we run infrastructure and operations as the team grows — tooling, observability, on-call, and the patterns other engineers build against.

About You

5+ years building and operating production infrastructure at companies where uptime, latency, and cost matter.
Proven experience with cloud platforms (GCP, AWS, Azure) and edge-first serverless platforms (e.g. Cloudflare Workers).
Deep expertise in operating large-scale databases (e.g., Postgres, Spanner).
A full-stack TypeScript shop won't faze you; you can move across the stack when the platform needs it.
High agency and a bias toward action. You don't wait for tickets — you see the bottleneck and fix it.
AI-forward in your workflow. You use coding agents, MCPs, and LLMs heavily and have opinions about what works.
Pragmatic about tradeoffs between speed and simplicity.

Bonus Points

Existing user of OpenRouter, or active side projects in AI products/infrastructure or developer tooling.

Compensation: Base salary $215,000 - $285,000 plus benefits & equity (US full-time). International compensation varies by local market.

Skills

Google Cloud, Cloudflare Workers, Spanner, ClickHouse, Postgres, TypeScript, GCP, AWS, Azure, LLM inference
