Senior Software Engineer, Infrastructure

Designs, builds, and maintains scalable infrastructure for a real-time telemetry platform supporting mission-critical systems. Requires 8+ years in distributed systems, cloud environments (AWS/GCP/Azure), Kubernetes, Docker, and DevOps tools.

170k – 220k | Marina del Rey, CA | San Francisco, CA | DevOps / SRE | Hybrid | 8+ YOE

About the role

Responsibilities

  • Design, build, and maintain scalable, resilient infrastructure solutions to support our growing platform and customer base.
  • Collaborate with software engineers to optimize application performance and reliability.
  • Implement monitoring, alerting, and logging systems to ensure proactive identification and resolution of issues.
  • Automate deployment processes and streamline infrastructure management using modern DevOps tools and methodologies.
  • Evolve our backend architecture and infrastructure for both cloud and on-premises deployments.
  • Work with the team to set and prioritize our roadmap to maximize customer impact.
  • Lead initiatives to improve infrastructure reliability, performance, and cost efficiency.

Requirements

  • 8+ years of relevant distributed systems experience focusing on designing and managing cloud-based environments (e.g., AWS, Azure, GCP).
  • Hands-on experience with containerization technologies (Docker, Kubernetes) and container orchestration platforms.
  • Passion for building and operating developer productivity tools, frameworks, and other aspects of platform engineering.
  • Familiarity with CI/CD pipelines and version control systems.
  • Knowledge of automated testing best practices and frameworks, ensuring software reliability through integration, performance, and end-to-end testing in distributed systems.
  • Experience with our tech stack (Go, Java, React, TypeScript, Kafka/Redpanda, Flink, AWS, Azure, GCP, Docker, Kubernetes, Terraform/CDK) or equivalent distributed systems technologies.
  • Good knowledge of cloud, on-prem, networking, and service architecture in multi-region, multi-cloud setups.

Engineering At SIFT (Relevant Technologies)

  • Web frontend & backend: ECharts, Go, gRPC, PostgreSQL, Protobuf, Radix, React, Redux, TypeScript.
  • Data: Arrow, DataFusion, Flink, Parquet, Rust.
  • Infrastructure: Argo CD, AWS, Docker, GitHub Actions, Grafana, Kubernetes, Kustomize, Linux, Prometheus, Terragrunt.

Compensation

  • Salary range: $170,000 – $220,000 per year, plus equity and benefits.

Skills

Kubernetes, Docker, AWS, GCP, Azure, Terraform, Go, CI/CD, Prometheus, Grafana, Argo CD, Flink, Kafka, Postgres, gRPC
