# Senior Site Reliability Engineer (SRE)
**Company:** [Tulip](https://hotfix.jobs/companies/tulip)
**Location:** Somerville, MA
**Experience:** 5+ years
**Skills:** Prometheus, OpenTelemetry, Kubernetes, Go, TypeScript, PromQL, LGTM Stack, Grafana, Loki, Tempo
**Posted:** 2026-02-25
> Senior SRE builds and maintains scalable infrastructure, mentors on observability best practices (SLIs/SLOs), handles incident response, and automates tools for engineering teams. Requires 5+ years with observability tools like Prometheus, OpenTelemetry, and Kubernetes.
## Job Description
## Key Responsibilities
- Mentor and evangelize on observability best practices, SLIs/SLOs, and reliability culture across engineering teams.
- Help architect our systems for growth and scale.
- Implement internal tools to automate common developer tasks.
- Perform incident response and debug production issues across the entire stack.
- Design, build, and maintain the core infrastructure used by all of Tulip’s engineering teams.
- Work to automate detection and resolution of recurring issues.

## Skills Required
- 5+ years of experience working with open-source observability tools (e.g., the LGTM stack).
- Hands-on experience instrumenting distributed systems using **OpenTelemetry** and managing metrics pipelines with **Prometheus** at scale.
- Experience working with time-series data, ideally using **PromQL**.
- Ability to pick up new languages and frameworks with ease. We currently run **Go** and **TypeScript** services on **Kubernetes**.

## About You
- Experience building and maintaining stable infrastructure at scale.
- Can reason about systems — their edge cases, failure modes, and life cycles.
- Excited about setting the technical agenda and coming up with novel, broad ideas.
- Can debug complex issues across the entire stack.
- Opinionated about the tools and frameworks that work best.
- Enjoys building for other engineers as much as, if not more than, building for a customer.
- Knows what a good SLA looks like, and can teach others how to spot one.
- Can communicate as well as they can code. Understands the value of discussion and works best in a team that champions clear and frequent communication.
**Apply:** https://hotfix.jobs/jobs/senior-site-reliability-engineer-sre-at-tulip-5b33f3b7-231d-4851-bab5-c8e42efd3ff2
**Canonical:** https://hotfix.jobs/jobs/senior-site-reliability-engineer-sre-at-tulip-5b33f3b7-231d-4851-bab5-c8e42efd3ff2