Software Engineer

230k – 270k | San Francisco, CA | Hybrid | 4+ YOE
Summary

Design, build, and operate large-scale infrastructure services and automation tooling. Requires 4 years of experience with distributed systems, Kubernetes, IaC, CI/CD, and cloud infrastructure.

About the role

Responsibilities

  • Design, develop, and deploy new infrastructure services and automation tools to support platform growth and new product initiatives
  • Manage and optimize existing infrastructure components (compute, storage, networking) across 50+ global regions
  • Lead and participate in incident management, including conducting postmortems and root cause analyses and implementing long-term improvements
  • Evaluate infrastructure decisions and capacity planning strategies to improve reliability, scalability, and performance
  • Collaborate across teams to drive reliability, security, and compliance throughout the software lifecycle

Requirements

  • Bachelor’s degree or foreign equivalent in Computer Science or a related field
  • 4 years of experience in the job offered or in a Software Engineering-related role
  • 2 years of experience designing and operating complex, large-scale distributed systems in production, including service discovery, load balancing, high availability, and disaster recovery across multi-region or multi-availability-zone deployments
  • 2 years of experience implementing Infrastructure as Code (IaC) using tools such as Terraform or Pulumi, including authoring reusable modules, performing code reviews, and executing change management with drift detection and automated policy checks
  • 2 years of experience administering Kubernetes in production, including cluster provisioning and upgrades, workload orchestration and autoscaling, Helm-based packaging, and network policy configuration
  • 2 years of experience building internal automation and platform tooling using scripting languages such as Python, Bash, or Rush, including developing command-line tools or services that interact with cloud and Kubernetes APIs and implementing automated tests
  • 2 years of experience configuring and operating observability stacks, including metrics, logs, and distributed tracing (e.g., Datadog, OpenTelemetry, Sentry), defining SLIs/SLOs, and creating actionable alerts integrated with incident response tooling (e.g., PagerDuty or Incident.io)
  • 2 years of experience designing and maintaining CI/CD pipelines (e.g., GitHub Actions or BuildKite), including build, test, and deployment automation, artifact management, and progressive delivery strategies (blue/green or canary)
  • 2 years of experience engineering cloud infrastructure on at least one major cloud platform (AWS, GCP, or Azure), including compute, networking (VPC/VNet design, routing, load balancing, and peering), identity and access management, and object/block storage
  • 2 years of experience managing operational data stores and caches (e.g., PostgreSQL or MySQL; Redis; and a document or key-value store such as MongoDB or DynamoDB), including replication/backup configuration, schema or data modeling, and performance tuning
  • 2 years of experience implementing network and platform security controls, including secrets management (e.g., CKMS, EKMS, CMEK), OS hardening and patching, least-privilege IAM policy design, and vulnerability remediation workflows with CI/CD gates

Nice-to-Haves

  • Experience with multi-region or multi-availability-zone deployments
  • Experience with progressive delivery strategies (blue/green or canary)

Skills

Terraform, Pulumi, Kubernetes, Python, Bash, Datadog, OpenTelemetry, Sentry, GitHub Actions, BuildKite, AWS, GCP, Azure, PostgreSQL, MySQL