Staff Software Engineer, Developer Productivity
Staff-level IC role owning end-to-end CI/CD, merge queue, and deploy pipelines for Anthropic's engineering org. Focus on AI-assisted review, test reliability, and progressive delivery at monorepo scale.
Key responsibilities
- Own the build, test, merge, and deploy pipeline end to end — what runs on each PR, what auto-approves, what gates merge, and how a change progresses to running healthy in production
- Drive down and defend "time from push to healthy in prod" as a core engineering metric
- Design and tune AI-assisted code review so confidence-to-land scales with PR volume
- Build the deploy and release path — canary, progressive rollout, health checks, automated rollback — in partnership with the platform teams who own the underlying substrate
- Improve test reliability by quarantining, root-causing, and retiring intermittent failures
- Shape CI and repository topology (build graph, test targeting, scope boundaries) to match how the company actually ships
- Partner with platform, delivery infrastructure, and security teams, and represent Developer Productivity in cross-org pipeline decisions
- Design processes (postmortem review, incident response, on-call) that help the team operate reliably and never fail the same way twice
Minimum qualifications
- Significant backend or developer-infrastructure engineering experience, with hands-on responsibility for a high-leverage CI/CD, merge queue, or land pipeline at scale
- Proficiency in Python and at least one statically typed systems language (e.g., Go or Rust)
- Experience operating CI/CD or release systems through production incidents, including writing postmortems and driving remediations
- Demonstrated ability to work across team boundaries — building consensus with platform, security, and product engineering stakeholders
- Comfort using AI coding tools as a daily part of your workflow, with informed opinions on where they provide leverage
Preferred qualifications
- 7+ years of backend or developer-infrastructure experience
- Experience with Bazel or similar build-graph / test-targeting systems at monorepo scale
- Experience with progressive delivery or release engineering at scale (canary analysis, automated rollback, health-gated promotion)
- A track record of leading — or making the well-reasoned case against — a repo split, monorepo extraction, or comparable scope-boundary migration
- A history of authoring engineering policy or paved-path tooling that other teams adopted voluntarily
- Familiarity with Kubernetes, Buildkite, GitHub Actions, or comparable CI/deploy substrates
- Interest in the safe and beneficial development of AI
Representative projects
- Reducing p50 merge-to-production time by re-architecting the merge queue and test selection strategy
- Building an AI-assisted review layer that auto-approves low-risk changes and routes high-risk ones to the right reviewers
- Designing a flaky-test quarantine and burndown system that returned CI signal to >99% reliability
- Standing up canary and progressive rollout for a service fleet, with automated rollback on health regression
- Authoring the RFC and migration plan for a build-graph or repository topology change adopted across multiple teams