Infrastructure Engineer

Builds and scales highly available infrastructure using AWS, Terraform, and Docker to support rapid growth and AI workloads. Collaborates with product and research teams on architectures, CI/CD, monitoring, and performance optimization.

130k – 500k · San Francisco, CA · DevOps / SRE · Onsite

About the role

What You’ll Work On

  • Designing and maintaining core infrastructure across cloud environments.
  • Building Infrastructure-as-Code workflows to automate deployments and scaling.
  • Improving monitoring, logging, and alerting systems to maintain reliability.
  • Managing CI/CD pipelines (GitHub, Spacelift) to ensure seamless deployments.
  • Supporting disaster recovery planning and ensuring high availability of systems.
  • Partnering with product and research teams to design architectures that scale with workload demands.
  • Identifying and fixing performance bottlenecks in compute, storage, and networking.

What We’re Looking For

  • Strong experience with cloud platforms (AWS).
  • Proficiency with Infrastructure-as-Code tools (Terraform).
  • Deep experience with containers and orchestration (Docker).
  • Solid understanding of distributed architectures.
  • Experience with programming languages (Python, Go).
  • Proven ability to ship reliable infrastructure in production environments.
  • Growth mindset and eagerness to thrive in a hyper-growth, high-ownership environment.

Benefits

  • Generous equity grant vesting over 4 years
  • A $20K relocation bonus (if moving to the Bay Area)
  • A $10K housing bonus (if you live within 0.5 miles of our office)
  • A $1K monthly stipend for meals
  • Free Equinox membership
  • Health insurance

Skills

AWS, Terraform, Docker, Python, Go, Kubernetes, CI/CD, GitHub, Spacelift, Infrastructure as Code
