Senior Site Reliability Engineer, Core AI Infrastructure

Senior SRE owning reliability, monitoring, and automation for Coinbase's AI infrastructure on AWS and Kubernetes. Requires 5+ years cloud automation experience and strong incident response skills.

186k – 219k · United States · DevOps / SRE · Remote · 5+ YOE

About the role

Responsibilities

  • Own the reliability, monitoring, and incident response lifecycle for AI infrastructure services, including on-call support for AWS deployment pipelines, root cause analysis, and blameless retros.
  • Build automation and tooling to streamline operational IT workflows, eliminate manual tasks, and improve deployment velocity across CI/CD frameworks and Kubernetes environments.
  • Partner with the Coinbase Infrastructure team to extend CI/CD frameworks supporting IT services and enterprise network platforms, and with Security and Compliance to integrate surveillance tooling into deployment pipelines.
  • Strengthen observability and documentation standards across IT engineering by defining metrics, implementing monitoring solutions, and maintaining technical documentation that sets a standard of excellence.
  • Develop full-stack applications that power internal AI products and infrastructure with Go or Python.

Requirements

  • 5+ years of experience automating and supporting cloud infrastructure (AWS) and network environments, with hands-on use of infrastructure-as-code tools (Terraform, Ansible, Chef, Puppet, or Salt).
  • Proven experience deploying, managing, and troubleshooting containerized workloads using Docker and Kubernetes in production environments.
  • Proficiency in at least one scripting or programming language (Python, Bash, Ruby, or Go) and version control workflows using Git-based CI/CD pipelines.
  • Track record of leading incident response in environments with strict SLAs, including root cause analysis, blameless retros, and measurable reliability improvements.
  • Responsible use of generative AI, with human oversight, to deliver business-ready outputs and drive measurable improvements in workflow efficiency, cost, and quality.

Nice to Haves

  • Expertise with Linux, Bash, Ruby, Python, and/or Go
  • Expertise automating EC2 or container deployments with Terraform
  • Strong network security fundamentals
  • Experience managing and leveraging log aggregation
  • Experience working in a highly regulated environment
  • Experience in a fast-paced, high-growth company
  • Experience in a remote-first IT environment

Skills

AWS, Terraform, Ansible, Chef, Puppet, Salt, Docker, Kubernetes, Python, Bash, Ruby, Go, Git, CI/CD

Similar roles

DevOps / SRE jobs

Senior Software Engineer, Infrastructure

Senior engineer building and standardizing AWS/GCP cloud infrastructure, networking, and self-service tooling for Coinbase's multi-cloud platform.

186k – 219k · United States · DevOps / SRE · Remote · 5+ YOE · Go · AWS

Senior Software Engineer, Infra - Compute Platform

Senior engineer owning Kubernetes-based compute orchestration platform. Builds tooling, automation, and AI-driven workflows to improve reliability and developer experience across Coinbase services.

186k – 219k · United States · DevOps / SRE · Remote · 5+ YOE · AWS · GCP

Senior Site Reliability Engineer, Identity Platform

Senior SRE owning reliability, automation, and DevOps for Coinbase's corporate IAM platform. Requires 5+ years SRE/infra experience with hands-on IAM ownership and cloud/IaC skills.

186k – 219k · United States · DevOps / SRE · Remote · 5+ YOE · Go · C#

Senior Software Engineer, Developer Infrastructure

Senior Software Engineer building and operating Coinbase's core developer infrastructure: CI/CD, build systems, deployment orchestration, and test infrastructure used by all Coinbase engineers.

186k – 219k · United States · DevOps / SRE · Remote · 5+ YOE · Go · AWS

Senior Software Engineer, Product Platform

As a Senior Software Engineer, Product Platform, you will design, build, and operate foundational systems that enable Block engineers to create, ship, and operate products with confidence and speed. You will work across platform domains, partner with engineering teams, and leverage AI tools to accelerate development.

185k – 327k · New York, NY · DevOps / SRE · On-site · 5+ YOE · Go · AI