Senior SRE/DevOps Engineer
United States, Remote, 5+ YOE
Summary
The Senior SRE/DevOps Engineer owns and operates AWS infrastructure and Kubernetes-based application stacks for Metabase Cloud, debugs issues, builds automation tooling, and improves deployments. The role requires 5+ years of experience and strong skills in Kubernetes, AWS, Terraform, and a modern language such as Python or Go.
About the role
Responsibilities
- Own and operate our application stack and AWS infrastructure to orchestrate and manage our hosted customer instances of Metabase
- Debug runtime issues across the different levels of our application stack and hosting stack
- Develop and build our internal tooling and automation to manage the lifecycle of a hosted Metabase installation, from purchase to deployment, zero-downtime upgrades, and general operational health
- Continuously improve our automated deployments and testing
Requirements
- Thoughtful and careful
- Compulsively automates everything and documents it
- Able to make solid technical judgements and back them up articulately
- At least 5 years of experience building and operating production infrastructure, ideally on public cloud
- Strong Kubernetes and AWS experience
- Strong experience with IaC and Terraform
- Can write high quality and readable code in a modern language (e.g. Python, Go)
- Experience with modern monitoring stacks (e.g. Prometheus/Grafana/Datadog)
Projects
- Multi-region hosting
- Automate EKS cluster provisioning
- Extend our CRDs and Operators
- Improve the RDS sharding strategy for our multi-tenant platform
- Unify and improve our CI/CD platforms
- Collaborate with core application developers on changes to improve our application metrics, deployment speeds and CI integration
- Maintain our SOC 2 compliance and security posture
Skills
Kubernetes, AWS, Terraform, Python, Go, Prometheus, Grafana, Datadog, EKS, CI/CD