Staff Software Engineer, Platform

Leads infrastructure projects, establishes platform standards, and mentors engineers on a small platform team. Requires 9+ years backend/infra experience, 3+ in platform/SRE/DevOps, with expertise in AWS, Kubernetes, and observability tools.

San Francisco, CA | DevOps / SRE | Hybrid | 9+ YOE

About the role

What You'll Do

  • Live by and champion our values: #empathy, #execution, #humility, #curiosity.
  • Lead critical infrastructure projects and drive architectural decisions that span multiple teams, from problem definition and scoping through execution to the smooth delivery of impactful solutions.
  • Establish platform engineering standards and partner with product teams on reliability, performance, and architecture.
  • Be hands-on across the system: AWS infrastructure, CI/CD pipelines, internal tooling, and the application layer.
  • Communicate platform priorities and trade-offs to engineering leadership and help them build a prioritized roadmap.
  • Mentor other engineers who need technical guidance.

What You'll Bring

  • 9+ years of backend/infrastructure engineering, with 3+ years in platform/SRE/DevOps roles.
  • Experience managing large backlogs and making prioritization trade-offs.
  • Proven ability to stabilize unreliable systems (CI, infrastructure, observability).
  • Mentorship experience—you'll guide engineers who need technical leadership.
  • Strong communication skills to advocate for platform needs and explain trade-offs.
  • Comfort with ambiguity and autonomy—you'll define what "good" looks like.
  • Ability to balance tech debt paydown with new feature delivery.

Success Looks Like

  • Experience working with Node.js, PostgreSQL, and Redis, or similar, at scale.
  • Strong debugging skills across application, infrastructure, and networking.
  • Strong knowledge of testing best practices. TDD is a bonus.
  • CI/CD systems (GitHub Actions or equivalent).
  • Experience working with AWS services and infrastructure-as-code (Terraform strongly preferred).
  • Kubernetes (deployments, scaling, troubleshooting, security).
  • Experience using a distributed messaging system.
  • Observability tooling (Datadog, Prometheus, Grafana, etc.).

What We Offer

  • Competitive equity with a 10-year exercise window.
  • Full medical, dental, and vision coverage (100% for employees, 85% for dependents).
  • Unlimited PTO (with a 3-week minimum).
  • 401(k) plan.
  • Regular team offsites and meetups.
  • Hybrid work: 2+ days/week in our SF Office.

Skills

Node.js, Postgres, Redis, AWS, Terraform, Kubernetes, CI/CD, GitHub Actions, Datadog, Prometheus
