Staff Site Reliability Engineer
Founding Staff SRE for Kong's internal developer platform (Volcano). Define reliability posture, build multi-region Kubernetes infrastructure, establish GitOps/CI-CD, and scale managed data services.
Builds, operates, and scales infrastructure for web/AI products, focusing on reliability, cost-efficiency, and support for large language models. Requires expertise in distributed systems, cloud platforms, performance tuning, and scalable architectures.
Owns and maintains C++ build systems for autonomous aircraft software, improves developer velocity by optimizing CI/CD pipelines, integrates testing with simulations, and implements monitoring to resolve issues quickly. Requires 7+ years of experience and deep expertise in build tools and DevOps practices.
Designs, implements, and maintains scalable infrastructure using Kubernetes and Terraform. Architects GitOps pipelines, drives security initiatives, and mentors teams to enhance developer velocity and platform reliability. Requires 7+ years of experience and a bachelor's degree.
Integrates the autonomy software stack for AI robotics platforms, including multi-agent systems, sensor processing, and hardware deployment across simulation, hardware-in-the-loop (HIL), and flight environments. Requires 7+ years of experience, Python/C++ proficiency, CI/CD expertise, and strong systems integration skills.
Define and implement reliability systems for a growing AI cloud infrastructure platform, including architectural improvements, operational processes, monitoring, and incident response. Requires 5+ years of production coding experience, 2+ years of on-call experience, and strong cloud skills.