Lead DevOps Engineer

Lead DevOps Engineer who owns multi-cloud SaaS infrastructure, CI/CD automation, and observability, and mentors engineers. Requires 5-7+ years of experience with Kubernetes, Terraform, cloud platforms, and production environments.

Somerville, MA | DevOps / SRE | Hybrid | 5+ YOE

About the role

Key Responsibilities

  • Own the deployment, health, and continuous improvement of Tulip's multi-cloud, multi-region SaaS environments — including clusters spanning the US, Europe, and Asia
  • Design and evolve cloud architecture to ensure customer availability, stability, and performance as Tulip scales globally
  • Contribute to and help shape the infrastructure technical roadmap in partnership with engineering leadership
  • Own and continuously improve Tulip's CI/CD infrastructure, driving toward a fully automated, human-interaction-free software delivery lifecycle
  • Build automation tooling and internal systems that reduce operational toil and increase developer velocity
  • Define and maintain observability standards across Tulip's cloud environments, including metrics, alerting, logging, and distributed tracing
  • Proactively identify performance degradation and capacity risks before they impact customers; lead incident response and drive root cause analysis
  • Mentor and coach junior and mid-level engineers through code reviews, pairing sessions, and regular technical guidance
  • Serve as a close partner to application engineering teams throughout the SDLC, providing infrastructure guidance and support
  • Participate in the on-call rotation and help establish on-call best practices that scale as the team grows

Requirements

  • 5-7+ years of hands-on DevOps or Infrastructure Engineering experience, with demonstrated ownership of production cloud environments at scale
  • Proficiency with modern cloud infrastructure tooling — experience with Kubernetes, Helm, Terraform, Ansible, and major cloud providers (AWS and/or Azure)
  • Proven experience mentoring and coaching engineers — whether formally or informally
  • Experience managing enterprise-grade data persistence layers, including NoSQL and SQL databases, key/value stores, and messaging systems (e.g., AMQP, MQTT)
  • Familiarity with observability and monitoring tooling (e.g., Prometheus, Mimir, Thanos, Grafana) and a strong understanding of what good SRE practice looks like in a fast-growing SaaS environment
  • Comfort driving team rituals — sprint planning, standups, retrospectives — and contributing to a high-performing team culture
  • Exposure to modern programming or scripting languages used in infrastructure contexts (e.g., Go, TypeScript, Python, Bash)
  • Bachelor's degree in Computer Science, Engineering, or equivalent practical experience

Nice-to-Haves

  • Experience with multi-cloud, multi-region SaaS environments
  • Experience with CI/CD infrastructure automation

Benefits

  • Company equity
  • Competitive benefits package including Health, Dental, Vision, Short-term Disability, Long-term Disability, Life Insurance, AD&D Insurance, Flexible Spending Account (FSA), Commuter Benefits, Parental Leave, and 401(k)
  • Flexible work schedule and unlimited vacation policy
  • Virtual company events and happy hours
  • Fitness subsidies

Skills

Kubernetes, Helm, Terraform, Ansible, AWS, Azure, Prometheus, Grafana, Python, Go
