Senior Software Engineer, Observability

147k – 202k | Bellevue, WA | Chicago, IL | New York, NY | Washington, DC | DevOps / SRE | Hybrid | 5+ YOE

Summary

Senior engineer on the Auth0 Platform Observability team responsible for designing, building, and maintaining scalable observability infrastructure (metrics, logs, traces) using Datadog, Terraform, and OpenTelemetry.

About the role

Responsibilities

  • Champion observability best practices, acting as an educator who can effectively correct anti-patterns and teach other engineering teams how to build robust, standardized instrumentation.
  • Be an expert in running services in production environments.
  • Contribute to the process of designing services for high growth and high availability.
  • Provision, configure, and monitor cloud-native infrastructure and services.
  • Design, build, and maintain scalable observability infrastructure using tools like Terraform.
  • Troubleshoot performance and operational issues.
  • Automate operational tasks and improve existing scripts.
  • Assist with and provide feedback for performance testing and automation.
  • Actively participate in major incident response to diagnose root causes and identify critical gaps in current telemetry tooling.
  • Act as a technical leader, driving cross-team initiatives to improve instrumentation and observability standards across the broader engineering organization.

Requirements

  • 5+ years of platform engineering, SRE, or DevOps experience.
  • Experience with cloud infrastructure like AWS, Google Cloud, or Azure.
  • Expertise in the Datadog ecosystem (Metrics, Logs, Traces, and Error Tracking), including establishing alerting standards, implementing tagging taxonomies, and managing Datadog configurations via Terraform.
  • Strong coding skills in Node.js or Golang.
  • Experience with containerization and orchestration tools (e.g., Docker, Kubernetes).
  • A data-driven approach to debugging complex, cross-service performance bottlenecks.
  • Deep understanding of microservice architecture and best practices.
  • Experience in coaching and mentoring more junior engineers.
  • Proven ability to lead cross-functional technical initiatives and collaborate seamlessly with multiple engineering teams.
  • Hands-on experience with OpenTelemetry (OTel), Vector, or similar frameworks for instrumenting applications.

Skills

AWS, Google Cloud, Azure, Datadog, Terraform, Node.js, Golang, Docker, Kubernetes, OpenTelemetry