
Senior Software Engineer, DevProd (Infrastructure Observability)

Leads end-to-end development of scalable distributed systems for infrastructure observability, owns production issues, and collaborates on designs. Requires expertise in Go, Kubernetes, SQL, cloud providers, and observability tools like ClickHouse and Prometheus.

176k – 238k · United States · DevOps / SRE · Remote

About the role

What You'll Do

Build

  • Lead the end-to-end Software Development Lifecycle: goals & requirements solicitation, design & review, implementation, operationalization & deployment, support & maintenance.
  • Formulate feature designs, review with stakeholders, iterate to incorporate feedback and drive consensus.
  • Clearly document design choices and operational knowledge to successfully deploy and manage the software you develop.
  • Provide appropriate unit, integration, and performance test coverage, along with production readiness coverage, for your feature ownership area.

Own

  • Set a high bar for technical excellence and take pride in the software you develop.
  • Design and build multi-component, distributed systems that operate at scale.
  • Investigate issues with a methodical approach to identify a root cause.
  • Understand the performance and reliability implications of design options at scale, and make the related tradeoffs.
  • Participate in the team’s on-call rotation.

Learn

  • Develop expert-level knowledge of the architecture and services of your assigned domain, and a strong command of all aspects of the Temporal ecosystem.
  • Investigate and understand ways to best leverage Temporal’s own software to power our mission.
  • Deeply understand the needs of Temporal internal developers and external customers, and leverage that knowledge for product development and feature design.

Collaborate

  • Participate in design reviews and contribute to design of other features.
  • Share design principles for building reliable systems at scale.

What You'll Bring

  • User-first mindset. You’re excited by the opportunity to empower others through tooling, and enjoy deeply internalizing user goals and use cases to build effective solutions.
  • Motivated by impact. You are driven by a desire to make positive things happen.
  • Strong opinions about tools and technology that are equally balanced by a pragmatic drive for impact.
  • Ability to work in a self-directed manner in a fast-paced environment.
  • Excellent collaboration and communication skills.

Skills & Technologies

  • Demonstrated ability to develop horizontally scalable, resilient, and high performance distributed systems in a production environment.
  • Experience designing, implementing, deploying, and supporting large scale, geographically distributed observability and/or high throughput data streaming/processing pipelines, or similar.
  • Expert in one or more high-level programming languages, preferably Go.
  • Expert-level Kubernetes skills.
  • Expert-level query development skills, preferably SQL.
  • Hands-on experience with one or more cloud providers, preferably AWS or GCP.
  • Thorough understanding of computer architecture, operating systems, and networking.
  • Familiarity with best practices regarding monitoring, instrumenting, and configuring infrastructure.

Compensation

  • The estimated pay range for this role is $176,000 - $237,600, depending on experience and location.
  • This role is eligible to participate in Temporal's equity plan.

Skills

Go, Kubernetes, SQL, AWS, GCP, ClickHouse, Prometheus, Grafana, Loki, Thanos
