Performance Modeling Engineer

266k – 445k | San Francisco, CA | Seattle, WA | Hybrid
Summary

Develops and maintains performance modeling tools and frameworks to evaluate AI system behavior and analyze tradeoffs across compute, memory, networking, and storage. Collaborates with architects on simulations and insights for infrastructure design. Requires a strong software or modeling background and knowledge of system architecture.

About the role

Key Responsibilities

  • Develop and maintain performance modeling tools and frameworks.
  • Build models to evaluate system behavior across compute, memory, and interconnect subsystems, including distributed system scaling and bottlenecks.
  • Run simulations and analytical models to support architectural tradeoff analysis.
  • Collaborate with the performance modeling lead and system architects to answer forward-looking design questions.
  • Analyze and interpret modeling outputs, translating results into actionable insights.
  • Validate models against real system measurements and workload behavior.
  • Contribute to improving modeling fidelity, usability, and scalability.

Qualifications

  • Strong software engineering or modeling background (e.g., simulation, systems modeling, or performance analysis).
  • Familiarity with system architecture fundamentals (compute, memory, networking).
  • Experience with programming and building technical tools or frameworks.
  • Ability to reason about performance bottlenecks and scaling behavior.
  • Strong analytical skills and comfort working with quantitative models.
  • Ability to collaborate across teams and learn new system domains quickly.

Preferred Skills

  • Exposure to AI/ML workloads or distributed systems.
  • Experience with simulation tools, performance modeling, or systems analysis.
  • Familiarity with data center infrastructure or large-scale systems.
  • Experience working with performance data, benchmarking, or profiling tools.
  • Interest in system architecture and hardware/software co-design.
Skills

Performance Modeling, Simulation Tools, Systems Modeling, Python, C++, Distributed Systems, AI/ML Workloads, Benchmarking, Profiling Tools, Data Center Infrastructure