# Software Engineer, Compute Infrastructure
**Company:** [Glean](https://hotfix.jobs/companies/glean)
**Location:** Mountain View, CA
**Salary:** $140K-$220K
**Experience:** 5+ years
**Skills:** Kubernetes, GCP, AWS, Azure, Infrastructure As Code, Distributed Systems, Observability, SLOs, Autoscaling, Multi-Tenancy
**Posted:** 2026-06-08
> Build and operate Kubernetes-based compute and runtime infrastructure powering AI search, assistant, and agent workloads across multi-cloud environments. Own reliability, scalability, cost-efficiency, and on-call for production platform services.
## Responsibilities
- Design, build, and own backend/platform services that power Glean’s runtime infrastructure, with a focus on reliability, scalability, and performance for AI and search workloads.
- Develop and evolve Kubernetes-based runtime primitives (e.g., service orchestration, scheduling integrations, autoscaling patterns) across a multi-cloud foundation (GCP, AWS, Azure).
- Collaborate with platform, data, and product engineering teams to make it easy and safe to spin up new services and batch workloads, with clear golden paths for deployment, configuration, and runtime operations.
- Drive end-to-end improvements in latency, resource utilization, and cost for core platform services, including multi-tenant runtime environments and experimental AI workloads.
- Implement and harden infrastructure-as-code patterns, observability, and guardrails so teams can confidently ship and run services in production (e.g., SLOs, dashboards, alerts, safe rollout/rollback).
- Partner with the Costs and Runtime teams to build shared mechanisms for attribution, guardrails, and automation that keep the runtime layer efficient as the company scales.
- Participate in an on-call rotation for critical platform services, lead incident response when needed, and translate learnings into better reliability, tooling, and documentation.
- Contribute to technical direction for Runtime Infra: help define roadmaps around multi-tenancy, autoscaling, capacity/placement, and platformized patterns.

## Requirements
- Strong distributed systems fundamentals and experience operating high-throughput, low-latency services or batch pipelines in production environments.
- Comfortable owning systems end-to-end: design, implementation, testing, deployment, observability, and ongoing operations.
- Experience thinking in terms of reliability and guardrails: SLOs, incident response, safe deployment strategies, and clear operational runbooks.
- Pragmatic and execution-oriented: balance ideal architectures with the constraints of a fast-moving startup and ship iterative improvements.
- Clear communicator with both infra and product engineers who enjoys collaborating across teams to understand requirements and translate them into platform capabilities.
- Excited to work in a multi-cloud, multi-tenant environment and help define best practices for running AI workloads efficiently at scale.

## Nice-to-Haves
- Experience with Kubernetes-based runtime systems and multi-cloud infrastructure (GCP, AWS, Azure).
- Background in cost-efficient, low-latency execution for production services and pipelines.

**Apply:** https://hotfix.jobs/jobs/software-engineer-compute-infrastructure-at-glean-e0a3b148-84c9-4527-a285-8b82c6473c63
**Canonical:** https://hotfix.jobs/jobs/software-engineer-compute-infrastructure-at-glean-e0a3b148-84c9-4527-a285-8b82c6473c63