# Senior Site Reliability Engineer Cloud Platform
**Company:** [Zilliz](https://hotfix.jobs/companies/zilliz)
**Location:** Redwood City, CA
**Salary:** $175K-$225K
**Experience:** 4+ years
**Skills:** Python, Go, Java, Kubernetes, Docker, AWS, GCP, Azure, Terraform, Ansible, Jenkins, GitLab CI, Argo, Milvus
**Posted:** 2025-05-15
> The Senior SRE focuses on ensuring the reliability, availability, and performance of distributed database systems in cloud-native environments. The role requires 4+ years of experience with Kubernetes, Docker, cloud platforms (AWS/GCP/Azure), IaC tools, and programming in Python/Go/Java.
## Job Description
## Responsibilities
- Work at the intersection of development and site reliability, creating SRE tools and systems while supporting existing infrastructure and platforms.
- Ensure the reliability, availability, and performance of Zilliz’s distributed database systems.
- Develop and implement strategies for monitoring, incident management, and disaster recovery.
- Automate system operations and maintenance tasks to improve efficiency and reduce manual intervention.
- Design and build tools to manage and monitor infrastructure, ensuring scalability and robustness.
- Collaborate with software engineers to enhance system reliability, scalability, and performance.
- Maintain and improve the CI/CD pipeline to ensure smooth and rapid deployment of changes.
- Actively contribute to the Milvus Vector Database open-source community, focusing on improving reliability and operational efficiency.

## Requirements
- 4+ years of experience in site reliability engineering or similar roles with a focus on cloud-native systems.
- Proficiency in programming or scripting languages such as **Python**, **Go**, or **Java**.
- Strong knowledge of container and orchestration technologies such as **Kubernetes** and **Docker**.
- Expertise with cloud platforms such as **AWS**, **GCP**, or **Azure**, and their respective monitoring and management tools.
- Experience with infrastructure as code tools such as **Terraform** or **Ansible**.
- Familiarity with CI/CD tools such as **Jenkins**, **GitLab CI**, or **Argo**.
- Proven ability to troubleshoot complex distributed systems and resolve issues promptly.
- **Bachelor’s degree** or above in computer science, software engineering, or other relevant disciplines.
- Ability to thrive in a fast-paced startup environment and handle multiple projects simultaneously.

## Nice-to-Haves
- Experience with the open-source Milvus vector database.

**Apply:** https://hotfix.jobs/jobs/senior-site-reliability-engineer-cloud-platform-at-zilliz-52337191-b610-4852-949c-b728a37995f9
**Canonical:** https://hotfix.jobs/jobs/senior-site-reliability-engineer-cloud-platform-at-zilliz-52337191-b610-4852-949c-b728a37995f9