Site Reliability Engineer II

Manages multi-cloud infrastructure on Azure, AWS, and GCP for Illumio SaaS products, focusing on reliability, scalability, automation, and incident response. Requires 2+ years of SRE/DevOps experience, Azure proficiency, and scripting skills.

141k – 162k | Sunnyvale, CA | DevOps / SRE | Onsite | 2+ YOE

About the role

Responsibilities

Design, deploy, and maintain cloud infrastructure solutions on Azure, AWS, and/or GCP to support applications and services.
Implement infrastructure as code (IaC) principles using tools such as Terraform, ARM templates, or CloudFormation to automate provisioning and configuration management.
Develop and maintain CI/CD pipelines for automated software delivery and deployment, leveraging tools such as Azure DevOps, AWS CodePipeline, or Jenkins.
Monitor system performance, application health, and infrastructure metrics using cloud monitoring and logging services, and implement proactive measures to optimize performance and availability.
Support incident response and resolution efforts, conduct root cause analysis, implement corrective actions, and document post-incident reviews.
Collaborate with Engineering teams to design and implement scalable and reliable architectures, providing guidance on best practices for cloud-native application development.
Implement security best practices and controls in cloud environments to protect data, applications, and infrastructure, and ensure compliance with regulatory requirements.
Drive automation initiatives to streamline operational tasks, reduce manual effort, and improve overall efficiency in cloud operations.
Stay current with cloud platform updates, trends, and best practices, and evaluate emerging technologies for potential adoption.
Provide support and guidance to junior team members, fostering a culture of learning, collaboration, and continuous improvement.

Requirements

Bachelor's degree in Computer Science, Engineering, or related field; or equivalent work experience.
2+ years of experience as an SRE, DevOps Engineer, or in a similar role, with hands-on experience with the Azure cloud platform in a production environment.
Exposure to AWS and/or GCP cloud platforms is preferred.
Proficiency in scripting and programming languages such as PowerShell, Python, or Go for automation and infrastructure management tasks.
Experience with CI/CD tools and methodologies, containerization technologies, and microservices architecture in cloud environments.
Strong analytical, problem-solving, and communication skills, with the ability to collaborate effectively with cross-functional teams.

Nice-to-Haves

Cloud certifications, such as Azure Administrator or Azure Developer, or AWS/GCP certifications.

Skills

Azure, AWS, GCP, Terraform, CI/CD, Azure DevOps, Jenkins, PowerShell, Python, Go, Infrastructure as Code, Containerization, Microservices
