Forward Deployed SRE
The Site Reliability Engineer owns the reliability of multi-cloud Kubernetes infrastructure for an AI/ML platform, builds observability tooling as code, automates mitigations, leads incident response, and defines SLOs and SLIs. The role requires extensive Kubernetes and observability experience.