Infrastructure Engineer

150k – 250k | San Francisco, CA | DevOps / SRE | Onsite
Summary

The Infrastructure Engineer scales ML inference systems serving 150+ biological models on Kubernetes and AWS. The role requires containerization experience, cloud platform knowledge, and onsite presence in San Francisco.

About the role

Responsibilities

  • Architect and maintain infrastructure serving 150+ biological ML models
  • Scale the platform by several orders of magnitude to meet growing demand
  • Orchestrate containerized workloads using Kubernetes
  • Optimize resource allocation and ensure high availability
  • Work closely with founders on customer needs, unpredictable workloads, and Bio-ML models

Requirements

  • Solid programming and automation skills
  • Experience with containerization and orchestration concepts
  • Cloud platform knowledge (AWS/GCP/Azure)
  • Located in the SF Bay Area or able to relocate

Preferred

  • Experience scaling production systems
  • Kubernetes experience
  • Infrastructure as code tools (Terraform, Pulumi)
  • Monitoring and observability tools
  • Experience with GPU workloads

Tech Stack: Python, React, AWS (EC2, S3, DynamoDB), Docker, CUDA, Conda, TensorFlow/PyTorch, notebooks, bash/Slurm, APIs & web apps

Skills
Kubernetes, AWS, Docker, Terraform, Python, GPU, CUDA, Pulumi, TensorFlow, PyTorch