Performance & Systems Engineer, Codex

295k – 445k · San Francisco, CA · Hybrid

Summary

Optimizes performance across the Codex AI system's stack, including LLM inference, cloud orchestration, and agent behavior, to reduce latency and cost. Collaborates with researchers and engineers on high-impact improvements in a high-ownership role.

About the role

Responsibilities

  • Hunt down and address inefficiencies across the Codex system stack, from agent behavior to LLM inference to container orchestration, and beyond.
  • Build tooling to measure, profile, and optimize system performance at scale.
  • Collaborate with researchers and engineers to land high-ROI changes that improve latency and cost.

Requirements

  • Experience operating across both ML systems and cloud infrastructure.
  • Enjoy diving into messy, ambiguous problems and emerging with clear wins.
  • Think holistically about performance, balancing speed, cost, and user experience.

Skills

LLM inference, cloud orchestration, Kubernetes, performance optimization, ML systems, container orchestration, profiling tools, system monitoring, latency optimization, cost optimization