# Staff Infrastructure Software Engineer, Enterprise AI
**Company:** [Scale AI](https://hotfix.jobs/companies/scale-ai)
**Location:** New York, NY; San Francisco, CA
**Salary:** $216K-$270K
**Experience:** 5+ years
**Skills:** Kubernetes, Terraform, Helm, Datadog, Prometheus, Grafana, AWS, Azure, GCP, Python
**Posted:** 2026-02-12
> Builds and scales multi-cloud infrastructure for Agentic AI workflows serving enterprise customers, focusing on security, compliance, observability, and developer tools. Requires 5+ years of experience with modern infrastructure practices, cloud providers, and languages like Python.
## Job Description
## What You'll Do:
- **Define the architectural patterns** for our multi-cloud infrastructure to support secure, reliable, and scalable Agentic workflows for enterprise customers.
- **Lead the infrastructure roadmap** with a strong focus on compliance, privacy, and security standards, including designing change management and data isolation strategies.
- **Own the development and maintenance** of our best-in-class Agentic observability platform (logging, metrics, tracing, and analytics) to proactively ensure system health and enable rapid incident response.
- **Drive developer efficiency** by building automated tooling and championing Infrastructure-as-Code (IaC) paradigms throughout the engineering organization.
- **Solve the toughest engineering problems** related to multi-tenancy, data isolation, and high-performance inference at massive scale, taking end-to-end ownership across the full product lifecycle.

## What We're Looking For:
- **Proven experience** in a senior role, with **5+ years** of full-time software engineering experience.
- **Deep understanding of modern infrastructure practices**, including CI/CD, IaC (e.g., Terraform, Helm Charts), container orchestration (e.g., Kubernetes), and observability platforms (e.g., Datadog, Prometheus, Grafana).
- **Extensive experience with at least one major cloud provider** (AWS, Azure, or GCP).
- **Strong knowledge of security and compliance** in enterprise environments, with a focus on access management, data isolation, and customer-specific VPC setups.
- **Proficiency in SQL and in Python or JavaScript/TypeScript**.
- **Bonus points**: Hands-on experience with, and a passion for, Agents, LLMs, vector databases, and other emerging AI technologies.
**Apply:** https://hotfix.jobs/jobs/staff-infrastructure-software-engineer-enterprise-ai-at-scale-ai-223e9053-6d4a-46f8-8f3a-badb573e6569
**Canonical:** https://hotfix.jobs/jobs/staff-infrastructure-software-engineer-enterprise-ai-at-scale-ai-223e9053-6d4a-46f8-8f3a-badb573e6569