Senior DevOps Engineer

Builds and maintains scalable infrastructure, CI/CD pipelines, and observability for AI-powered developer tools. Requires 5+ years DevOps/SRE experience, Kubernetes/Docker expertise, and cloud/IaC proficiency.

225k – 260k · San Francisco, CA · DevOps / SRE · Hybrid · 5+ YOE

About the role

Responsibilities

  • Design, implement, and maintain scalable CI/CD pipelines
  • Develop and manage infrastructure as code (e.g., Terraform, Pulumi)
  • Improve system reliability through monitoring, alerting, logging, and failover strategies
  • Work with platform and backend teams to identify and resolve performance bottlenecks
  • Contribute to deployment workflows, environment automation, and developer tooling
  • Ensure infrastructure security and compliance practices are in place

Qualifications

Education: Degree in Computer Science, Engineering, or a related technical field, or equivalent practical experience

Experience: 5+ years of experience in a DevOps, Infrastructure, or SRE role at a fast-paced tech company or startup

Tooling: Expert-level proficiency with CI/CD systems (GitHub Actions, Argo CD, etc.), Docker, and Kubernetes

Infrastructure: Expertise in cloud providers (AWS/GCP), distributed systems architecture and implementation, IaC tools (Terraform, Pulumi), and secrets management (Vault, SSM, etc.)

Observability: Strong understanding of logging, metrics, and monitoring in large-scale distributed systems (e.g., Grafana, Prometheus, ELK, Datadog)

Collaboration: Effective at partnering with backend and ML teams to deliver stable, high-velocity systems

Security: Experience applying best practices in cloud and application-level security

Bonus Points

  • Experience supporting AI or ML workloads in production
  • Experience with ephemeral environments and preview deployments
  • Contributions to internal platform tools or DevOps open-source projects
  • Past ownership of high-uptime systems or regulated environments

Skills

Kubernetes, Docker, Terraform, Pulumi, GitHub Actions, Argo CD, AWS, GCP, Prometheus, Grafana, Datadog, Vault

Similar roles

DevOps / SRE jobs

Senior Software Integration Engineer

Leads end-to-end integration of agentic platform tools with clients, backends, and cloud ops on Kubernetes. Requires 7+ years experience, strong Python, HTTP/auth expertise, and cross-functional collaboration for reliable tool contracts and releases.

225k – 249k · Foster City, CA · DevOps / SRE · Hybrid · 7+ YOE · JWT · TLS

Senior Software Engineer, Platform Team

Designs and builds distributed systems for scheduling, workflows, and storage abstractions powering research and trading in hybrid environments. Requires 5+ years experience in scalable services, modern languages like Python/Go, and Linux with strong system design skills.

225k – 255k · Berkeley, CA +1 · DevOps / SRE · Remote · 5+ YOE · Go · C++

Senior Platform Engineer

Leads development of DataHub's ingestion framework, building scalable metadata systems, APIs, and event-driven architectures for enterprise AI and data platforms. Requires 4+ years in distributed systems and advanced Python expertise.

225k – 300k · Palo Alto, CA · DevOps / SRE · Hybrid · 4+ YOE · dbt · Python

Senior Software Engineer, Execution Engineering

Develops production trading systems and data pipelines for machine learning in finance, designing real-time distributed systems, integrating markets, owning observability, and leading cross-team projects. Requires 5+ years in scalable distributed systems and cloud expertise.

225k – 255k · Berkeley, CA +1 · DevOps / SRE · Remote · 5+ YOE · AWS · GCP

Senior Platform Engineer, Operator

Designs, builds, and maintains an enterprise-grade Kubernetes operator in Rust for SMBP, focusing on API architecture, networking, security, observability, multi-cloud deployment, and operational tooling. Collaborates with customers to meet production Kubernetes needs.

223k – 259k · Atlanta, GA +3 · DevOps / SRE · Remote · OLM · Rust