Site Reliability Engineer II

Manages multi-cloud infrastructure on Azure, AWS, and GCP for Illumio SaaS products, focusing on reliability, scalability, automation, and incident response. Requires 2+ years SRE/DevOps experience with Azure proficiency and scripting skills.

141k – 162k | Sunnyvale, CA | DevOps / SRE | On-site | 2+ YOE

About the role

Responsibilities

  • Design, deploy, and maintain cloud infrastructure solutions on Azure, AWS, and/or GCP to support applications and services.
  • Implement infrastructure as code (IaC) principles using tools such as Terraform, ARM templates, or CloudFormation to automate provisioning and configuration management.
  • Develop and maintain CI/CD pipelines for automated software delivery and deployment, leveraging tools such as Azure DevOps, AWS CodePipeline, or Jenkins.
  • Monitor system performance, application health, and infrastructure metrics using cloud monitoring and logging services, and implement proactive measures to optimize performance and availability.
  • Support incident response and resolution efforts, conduct root cause analysis, implement corrective actions, and document post-incident reviews.
  • Collaborate with Engineering teams to design and implement scalable and reliable architectures, providing guidance on best practices for cloud-native application development.
  • Implement security best practices and controls in cloud environments to protect data, applications, and infrastructure, and ensure compliance with regulatory requirements.
  • Drive automation initiatives to streamline operational tasks, reduce manual effort, and improve overall efficiency in cloud operations.
  • Stay current with cloud platform updates, trends, and best practices, and evaluate emerging technologies for potential adoption.
  • Provide support and guidance to junior team members, fostering a culture of learning, collaboration, and continuous improvement.

Requirements

  • Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent work experience.
  • 2+ years of experience as an SRE, DevOps Engineer, or in a similar role, with hands-on experience running the Azure cloud platform in a production environment.
  • Exposure to AWS and/or GCP cloud platforms is preferred.
  • Proficiency in scripting and programming languages such as PowerShell, Python, or Go for automation and infrastructure management tasks.
  • Experience with CI/CD tools and methodologies, containerization technologies, and microservices architecture in cloud environments.
  • Strong analytical, problem-solving, and communication skills, with the ability to collaborate effectively with cross-functional teams.

Nice-to-Haves

  • Cloud certifications such as Azure Administrator or Azure Developer, or equivalent AWS/GCP certifications.

Skills

Azure, AWS, GCP, Terraform, CI/CD, Azure DevOps, Jenkins, PowerShell, Python, Go, Infrastructure as Code, Containerization, Microservices
