
Staff Software Engineer, Core Reliability

Staff engineer on the Infra Reliability team improving system resiliency, deployment safety, and configuration management for Coinbase's production environment at massive scale.

218k – 257k · United States · DevOps / SRE · Remote · 7+ YOE

About the role

What you’ll do

  • Build and launch reliability projects and features that improve resiliency across our service environment in partnership with reliability teams.
  • Work closely with critical T0/T1 services to understand architecture, improve scalability and reliability, and reduce operational toil.
  • Build and enhance systems that securely manage service configurations and secrets at scale.
  • Improve canary-based release systems to make deployments safer and reduce incidents.
  • Expand deployment capabilities to support thousands of services and hundreds of daily deployments.
  • Partner across teams to promote reliability best practices and strengthen reliability culture across Coinbase.

Required Skills and Experience

  • 7+ years of software engineering experience.
  • Experience designing, building, scaling, and maintaining production services in service-oriented architectures.
  • Strong system design and coding skills, with a track record of writing high-quality, well-tested code.
  • Strong observability, debugging, and performance tuning skills.
  • Excellent written and verbal communication skills, with the ability to explain technical concepts clearly.
  • Sound judgment under pressure and a willingness to debug and improve any layer of the stack.
  • Ability to participate in an on-call rotation and respond to issues outside normal business hours.
  • Experience building reliable, high-throughput, low-latency systems.
  • Experience with observability tools such as Kibana and Datadog.
  • Familiarity with rapid-growth environments.
  • Experience with Ruby, Go, Terraform, and cloud platforms such as AWS, GCP, or Azure.
  • Responsible use of generative AI, with human oversight, to deliver business-ready outputs and measurable improvements in workflow efficiency, cost, and quality.

Skills

Ruby, Go, Terraform, AWS, GCP, Azure, Kibana, Datadog, System Design, Observability

Similar roles

DevOps / SRE jobs

Staff Software Engineer

Staff Software Engineer owning technical strategy and systems for Coinbase's test infrastructure at scale. Focus on fast, reliable test signals through orchestration, smart selection, sharding, and flakiness remediation.

218k – 257k · United States · DevOps / SRE · Remote · 10+ YOE · Go · AWS

Staff Site Reliability Engineer

Staff SRE on the IT Operations team owning reliability, automation, and observability for Coinbase's AI infrastructure on AWS and Kubernetes. Requires 8+ years of cloud infrastructure experience and strong incident response leadership.

218k – 257k · United States · DevOps / SRE · Remote · 8+ YOE · Go · AWS

Staff Site Reliability Engineer

Leads infrastructure transformation from monoliths to scalable microservices at massive scale, architects observability and CI/CD systems, unifies complex stacks, and mentors engineers. Requires 10+ years of experience coding internal tools, 5+ years of cloud experience (GCP/AWS), and a Bachelor's in CS.

218k – 260k · Mountain View, CA · DevOps / SRE · On-site · 10+ YOE · GCP · AWS

Staff Infrastructure Software Engineer, Enterprise AI

Builds and scales multi-cloud infrastructure for enterprise agentic AI workflows, focusing on security, compliance, observability, and developer tools. Requires 5+ years of experience with modern infra practices, cloud providers, and languages like Python.

216k – 270k · New York, NY +1 · DevOps / SRE · Hybrid · 5+ YOE · AWS · GCP

Member of Technical Staff

Software Engineer on the Cloud Infrastructure team, designing, building, and operating foundational cloud primitives and deployment models. The engineer will own the roadmap and technical strategy for agent-driven cloud infrastructure management, ensuring secure and scalable solutions for various customer environments.

220k – 405k · San Francisco, CA +2 · DevOps / SRE · On-site · 7+ YOE · Go · AWS