Staff Software Engineer, Infrastructure

Designs, builds, and operates high-scale, low-latency production infrastructure services, owning SLOs and end-to-end reliability. Partners with teams to optimize performance, evolve CI/CD, and support diverse deployments. Requires 8+ years of experience plus strong observability and cloud expertise.

300k – 430k · San Francisco, CA · DevOps / SRE · Hybrid · 8+ YOE

About the role

Responsibilities

  • Design and implement critical infrastructure services with strong SLOs, clear runbooks, and actionable telemetry.
  • Partner with research and product teams to architect solutions, set up prototypes, evaluate performance, and scale new features.
  • Tune service latencies: optimize networking paths, apply smart caching/queuing, and tune CPU/memory/I/O for tight p95/p99s.
  • Evolve CI/CD, golden paths, and self-service tooling to improve developer velocity and safety.
  • Support various deployment architectures for customers with robust observability and upgrade paths.
  • Lead infrastructure-as-code (Terraform) and GitOps practices; reduce drift with reusable modules and policy-as-code.
  • Participate in on-call and drive down toil through automation and elimination of recurring issues.

Requirements

  • 8+ years building and operating production infrastructure at scale.
  • Depth in at least one area across Core/Data/AI-ML/Platform/Voice, with curiosity to learn the rest.
  • Proven track record meeting high availability and low latency targets (owning SLOs, p95/p99, and load testing).
  • Excellent observability chops (OpenTelemetry, Prometheus/Grafana, Datadog) and incident response (PagerDuty, SLO/error budgets).
  • Clear written communication and the ability to turn ambiguous requirements into simple, reliable designs.

Nice-to-Haves

  • Experience being an early backend/platform/infrastructure engineer at another company.
  • Strong Kubernetes experience (GKE/EKS/AKS) and experience across multiple cloud providers (GCP, AWS, Azure).
  • Experience with customer-managed deployments.

Compensation

  • $300K – $430K + equity

Skills

Kubernetes, Terraform, GCP, AWS, Azure, OpenTelemetry, Prometheus, Grafana, Datadog, PagerDuty, GKE, EKS, AKS, GitOps

Similar roles

DevOps / SRE jobs

Staff+ Software Engineer, Caching

Build and operate Anthropic's managed Redis caching layer and client libraries from the ground up. Drive technical direction for distributed caching infrastructure across multi-cloud environments with focus on consistency, performance, and developer experience.

320k – 485k · San Francisco, CA +2 · DevOps / SRE · Hybrid · 10+ YOE · Go · C++

Staff Software Engineer, Infrastructure Asset Systems

As a Staff Software Engineer, you will build and extend systems for tracking, governing, and reporting on infrastructure assets. This involves designing data models, workflow engines, and integrations with financial and procurement systems, ensuring compliance and auditability.

320k – 405k · San Francisco, CA +1 · DevOps / SRE · Hybrid · 7+ YOE · Go · SQL

Staff Fiber Network Engineer

Owns the end-to-end physical layer of a private global dark-fiber backbone network, including route design, fiber acquisition, vendor management, acceptance testing, and lifecycle management. Requires deep OSP/fiber expertise, optical transport knowledge, and 8+ years of experience building fiber programs.

320k – 405k · San Francisco, CA +1 · DevOps / SRE · Hybrid · 8+ YOE · Go · GIS

Staff Engineer, Datacenter Server Lifecycle

Owns the end-to-end server lifecycle in datacenters at scale, from provisioning to decommissioning, with a strong focus on automation, trusted compute security, and hardware operations for AI workloads. Requires hands-on server hardware experience and proficiency in Python/Rust/Go, plus cloud infrastructure such as Kubernetes/AWS/GCP.

320k – 405k · San Francisco, CA +1 · DevOps / SRE · Hybrid · 8+ YOE · Go · AWS

Senior Staff Software Engineer, Infrastructure

Designs and implements large-scale public cloud infrastructure and builds complex distributed systems and microservices. Requires 10+ years of experience, expert skills in performance tuning and concurrency, experience with multiple cloud providers such as AWS/GCP/Azure, and a graduate degree or equivalent.

260k – 325k · United States · DevOps / SRE · Remote · 10+ YOE · Go · AWS