Software Reliability Engineer

109k – 163k · Mountain View, CA · Onsite
Summary

Build and operate resilient systems for Nuro's autonomous vehicle fleet. Design pipelines, automation, and tools that improve reliability and reduce operational toil. Join the on-call rotation and lead investigations.

About the role

About the Work

  • Build fleet-scale pipelines that turn noisy onboard signals into actionable, high-confidence investigations.
  • Develop automated triage and correlation systems that deduplicate issues, route them to the right owning teams, and attach up-to-date priority signals and diagnostic context.
  • Partner with engineering teams and subject matter experts to turn investigation outcomes into better instrumentation, automation, and signal quality over time.
  • Build internal tools and workflows that reduce duplicate effort and increase situational awareness as the fleet scales (self-service debugging, standardized metrics, shared templates, securely scoped access).
  • Lead reliability investigations to identify contributing factors and ensure learnings turn into durable engineering changes.

About You

  • Experience writing and shipping software with an ownership mindset and attention to how it behaves in real-world conditions.
  • Ability to build and maintain tools and automation (Python, Go, Bash, C++).
  • Comfortable navigating systems remotely via SSH and the CLI, and inspecting the state of a Linux system and its services.
  • Interest in reliability engineering as a growth path: motivated to learn how distributed systems are built and what it takes to scale them reliably.

This is a 12-month temporary full-time position with full benefits and potential for extension based on performance and business needs.

Skills
Python, Go, Bash, C++, Linux, SSH, Observability, Automation, Distributed Systems, Reliability Engineering