Software Engineer, Kernel Performance & AI Tooling
Develops kernel performance optimizations, AI-assisted tooling, and observability infrastructure for AI-native hardware. Requires strong low-level systems experience, kernel or accelerator expertise, and familiarity with AI workflows that accelerate engineering.
Responsibilities
- Build developer tooling and workflows that make kernel development and performance optimization faster, more scalable, and easier to debug, integrate, and deploy.
- Develop observability, diagnostics, and validation infrastructure that makes AI-assisted optimization systems more interpretable, reliable, and effective.
- Optimize production kernels end to end by formulating optimization problems, running search loops, analyzing bottlenecks, debugging generated implementations, and landing improvements into production.
- Design abstractions, interfaces, and automation systems that accelerate kernel optimization, correctness validation, and hardware-software co-design.
- Improve AI-assisted optimization systems for specialized tasks through better datasets, evaluations, benchmarking, and research infrastructure.
- Partner across research and engineering teams to turn new ideas into practical systems spanning production needs and long-term infrastructure strategy.
Requirements
- Strong systems or tooling engineering experience, with a background in low-level software, performance optimization, or infrastructure.
- Experience with developer tooling, debugging infrastructure, profiling, observability, or workflow design for technical users.
- Depth in kernel development, accelerator architecture, compiler systems, or related performance-critical domains.
- Familiarity with AI-assisted systems, agentic workflows, post-training, or reinforcement learning for engineering or research applications.
- Strong experimental judgment, comfort with ambiguity, and the ability to move fluidly between research exploration and production execution.
- Interest in compilers, DSLs, program synthesis, or AI for systems.
Preferred
- Strong systems and tooling engineer with real depth in kernels and accelerators.
- Comfortable working across software and hardware boundaries and able to reason deeply about performance, abstractions, and system design.
- Hands-on experience optimizing code for GPUs, high-performance CPUs, or custom accelerators.
- Views AI not as the end product but as a force multiplier for engineering productivity and system optimization.