Software Engineer
Designs, builds, and operates large-scale infrastructure services and automation tooling. Requires 4 years of experience with distributed systems, Kubernetes, IaC, CI/CD, and cloud infrastructure.
Leads development of internal AI agent infrastructure ("Goose") to boost velocity across engineering, ops, and other teams. Builds autonomous agent workflows for codebase inspection, testing, and complex tasks, with a strong focus on safety and accuracy.
Builds and improves CI/CD, testing, validation, and release tooling for OpenAI's inference runtime teams to ensure reliable, performant model deployments across ChatGPT, API, and research workloads. Requires strong Python skills, developer productivity experience, and high ownership in ambiguous environments.
Builds and operates high-performance networking infrastructure for OpenAI's large-scale AI training and inference, focusing on host networking, datacenter fabrics, and WAN systems. Optimizes latency, reliability, and scalability using technologies like RDMA, InfiniBand, and RoCE; requires strong systems programming in C++, Python, or Go.
Builds and improves developer tools, CI/CD pipelines, and testing workflows to boost productivity for OpenAI's model performance engineering teams. Requires strong Python skills, experience with developer infrastructure, and ability to work in ambiguous environments.
Enhances developer productivity for OpenAI's networking team by improving build systems, CI/CD pipelines, test harnesses, and workflows for C++ and Python codebases in multi-server environments. Requires experience with developer tools and infrastructure automation.