Software Engineer, ChatGPT Infrastructure
Design and build infrastructure platforms for ChatGPT, focusing on scalability, reliability, and developer productivity. Requires experience with large-scale distributed systems, performance optimization, and creating reusable abstractions for engineering teams.
Where You Can Have Impact
- Platform foundations & frameworks: core libraries, service frameworks, and shared components.
- Scalability & performance primitives: reduce tail latency, improve throughput, keep costs predictable.
- Reliability guardrails: rate limiting, load shedding, dependency isolation, backpressure, safe fallbacks.
- Developer productivity via golden paths: paved roads for common workflows.
- Observability & debugging systems: instrumentation, metrics, investigative tooling.
- Safe change management: deployment systems, progressive delivery, fast rollback.
- Interface and contract design: clean APIs and stable contracts.
What You’ll Do
- Build and evolve infrastructure platforms that many engineers and services depend on.
- Translate messy real-world constraints into clean abstractions: simple APIs, enforceable contracts, safe defaults.
- Drive improvements in reliability and performance through principled design, measurement, and iterative hardening.
- Partner across engineering and product to identify systemic pain points and turn them into reusable solutions.
- Own outcomes end-to-end: design → implementation → rollout → operational maturity.
Qualifications
Minimum Qualifications
- Experience building and operating large-scale distributed systems in production (high throughput, concurrency, and failure handling).
- Strong fundamentals in systems design, including caching, consistency, queueing/backpressure, and resilient dependency management.
- Ability to reason about performance (latency distributions, tail behavior, bottlenecks) and translate that into concrete engineering work.
- Track record of building platforms or shared infrastructure that improves velocity and correctness for other teams.
- Strong communication and collaboration skills—aligning on interfaces, navigating tradeoffs, and driving cross-team execution.
Preferred Qualifications
- Experience designing paved roads / golden paths (frameworks, libraries, self-serve tooling) that shape engineering behavior at scale.
- Deep understanding of reliability techniques: graceful degradation, circuit breakers, load shedding, rate limiting, and fault isolation.
- Experience building systems for safe iteration: progressive delivery, correctness checks, automated rollout gates, and production validation.
- Strong instincts for API and contract design—how to create interfaces that are stable, evolvable, and hard to misuse.
- Prior work that demonstrates “force multiplier” impact: enabling many teams through a small number of well-chosen primitives.
Staff Software Engineer, Growth AI
Staff Software Engineer anchoring AI-powered growth products across SEO and exploratory teams. Architect production ML systems, partner with ML orgs, and set technical direction as a senior IC.
Staff Backend Engineer, Search
Staff-level search engineer responsible for designing, scaling, and optimizing ClickUp's search infrastructure using OpenSearch/Elasticsearch, including real-time indexing, vector search, and relevance tuning.
Software Engineer, Cloud Agents
Build and scale orchestration, sandboxing, and storage systems for long-running cloud agents powering Codex, ChatGPT, and the OpenAI API. Requires 9+ years of experience leading large-scale backend or infrastructure projects.