ChatGPT Performance Engineer

325k – 405k | San Francisco, CA | Seattle, WA | New York, NY | Remote | 7+ YOE | Apr 15

Summary

The Performance Engineer optimizes infrastructure and application performance for ChatGPT and the OpenAI API, focusing on latency, throughput, and efficiency at scale. The role requires 7+ years of experience with high-scale systems and expertise in profiling, tracing, and cross-layer optimization.

About the role

Responsibilities

Analyze and optimize performance across application, middleware, runtime, and infrastructure layers—networking, storage, Python runtime, GPU utilization, and beyond.
Develop tooling and metrics that provide deep observability into system performance.
Collaborate closely with infra, platform, training, and product teams to identify key performance goals and drive systemic improvements.
Influence architecture and design decisions to prioritize latency, throughput, and efficiency at scale.
Lead investigations into high-impact performance regressions or scalability issues in production.
Drive performance testing strategies and help define SLAs/SLOs around latency and throughput for critical systems.

Requirements

7+ years of experience in software engineering with a strong track record in performance or reliability of high-scale distributed systems.
Deeply comfortable with performance profiling tools and tracing systems.
Experience optimizing performance across one or more layers of the stack (e.g., database, networking, storage, application runtime, GC tuning, Python/Golang internals, GPU utilization).
Strong understanding of OS internals, scheduling, memory management, and IO patterns.
Experience contributing to observability, benchmarking, or performance-focused infrastructure at scale.
Demonstrated success navigating ambiguity and aligning stakeholders around performance goals.
Value simplicity, rigor, and collaboration when solving complex systems problems.

Skills

Python, Golang, Performance Profiling, Distributed Systems, GPU Utilization, Observability, Tracing Systems, Networking, Storage Optimization, OS Internals
