Senior Platform Operations Engineer

The Senior Platform Operations Engineer owns platform infrastructure on AWS EKS, implements GitOps with ArgoCD, builds reliability practices with SLIs/SLOs, and drives observability using Datadog and CloudWatch. Requires 6+ years of SRE/DevOps experience and Kubernetes expertise.

172k – 195k | San Diego, CA | DevOps / SRE | Remote | 6+ YOE

About the role

Responsibilities

  • Own the Platform Infrastructure: Manage and scale the container environment on Amazon EKS, implement GitOps workflows using ArgoCD, and maintain CI/CD pipelines through GitHub Actions.
  • Build for Reliability: Define and track SLIs/SLOs, lead incident response (on-call rotations, root cause analysis, and post-mortems), and contribute to disaster recovery planning.
  • Drive Observability: Design and maintain the monitoring and logging stack using Datadog, Sentry, and CloudWatch.
  • Shape the Platform's Future: Collaborate on architectural decisions and build internal tooling and self-service workflows.

Requirements

  • 6+ years in SRE, DevOps, or Cloud Infrastructure; 2+ years in a senior role.
  • Confident with core AWS services (VPC, IAM, EKS, RDS) and cloud networking/security best practices.
  • Expert in Infrastructure as Code with Terraform, CloudFormation, or Crossplane.
  • Proficient with GitHub and GitHub Actions for CI/CD and automation.
  • Experienced with production Kubernetes clusters, GitOps (ArgoCD/Flux), and Helm charts.
  • Proficient with observability tooling: Datadog, Sentry, CloudWatch, Grafana.
  • Experience writing Python scripts for automation.
  • Comfortable working independently in a remote setup.
  • Bachelor’s degree in Computer Science, Engineering, or equivalent.

Nice to Haves

  • Certifications: AWS, Kubernetes, Terraform, or Python.

Benefits

  • Competitive pay with equity options.
  • Stellar health care plan (Medical, Dental & Vision), FSA, DCFSA, HSA.
  • Company-sponsored disability & life insurance.
  • Unlimited PTO.
  • 401(k) + 4% Matching.
  • Fully remote + flexible hours.
  • $750 work-from-home setup budget.
  • Paid biannual in-person summits.
  • Quarterly $150 co-hanging stipend.
  • Monthly $100 health and wellness benefit.
  • Generous paid family leave.
  • Annual $1,200 learning & development stipend.

Skills

Amazon EKS, Argo CD, GitHub Actions, Terraform, Kubernetes, AWS, Datadog, Sentry, CloudWatch, Python, GitOps, Helm, Grafana, CloudFormation, Crossplane
