Senior SRE/DevOps Engineer

United States, Remote, 5+ YOE
Summary

The Senior SRE/DevOps Engineer owns and operates the AWS infrastructure and Kubernetes-based application stacks for Metabase Cloud, debugs issues, builds automation tooling, and improves deployments. The role requires at least 5 years of experience, strong Kubernetes, AWS, and Terraform skills, and the ability to write code in a modern language such as Python or Go.

About the role

Responsibilities

  • Own and operate our application stack and AWS infrastructure to orchestrate and manage our hosted customer instances of Metabase
  • Debug runtime issues at every level of our application and hosting stacks
  • Develop and build our internal tooling and automation to manage the lifecycle of a hosted Metabase installation, from purchase to deployment, zero-downtime upgrades, and general operational health
  • Continuously improve our automated deployments and testing

Requirements

  • Thoughtful and careful
  • Compulsively automates everything and documents it
  • Able to make solid technical judgements and back them up articulately
  • At least 5 years of experience building and operating production infrastructure, ideally on a public cloud
  • Strong Kubernetes and AWS experience
  • Strong experience with IaC and Terraform
  • Can write high-quality, readable code in a modern language (e.g. Python, Go)
  • Experience with modern monitoring stacks (e.g. Prometheus/Grafana/Datadog)

Projects

  • Multi-region hosting
  • Automate EKS cluster provisioning
  • Extend our CRDs and Operators
  • Improve the RDS sharding strategy for our multi-tenant platform
  • Unify and improve our CI/CD platforms
  • Collaborate with core application developers on changes to improve our application metrics, deployment speeds and CI integration
  • Maintain our SOC2 compliance and security posture
Skills
Kubernetes, AWS, Terraform, Python, Go, Prometheus, Grafana, Datadog, EKS, CI/CD