Software Engineer, Infrastructure

Builds and maintains scalable infrastructure for a real-time telemetry platform supporting mission-critical systems. Requires 3+ years in distributed systems and expertise in cloud (AWS/GCP/Azure), Kubernetes, Docker, and DevOps tools.

150k – 200k · Marina del Rey, CA · San Francisco, CA · DevOps / SRE · Hybrid · 3+ YOE

About the role

Responsibilities

  • Design, build, and maintain scalable, resilient infrastructure solutions to support our growing platform and customer base.
  • Collaborate with software engineers to optimize application performance and reliability.
  • Implement monitoring, alerting, and logging systems to ensure proactive identification and resolution of issues.
  • Automate deployment processes and streamline infrastructure management using modern DevOps tools and methodologies.
  • Evolve our backend architecture/infrastructure for both cloud and on-premise deployments.
  • Work with the team to set and prioritize our roadmap to maximize customer impact.
  • Lead initiatives to improve infrastructure reliability, performance, and cost efficiency.

Requirements

  • 3+ years of relevant distributed systems experience focusing on designing and managing cloud-based environments (AWS, Azure, GCP).
  • Hands-on experience with containerization technologies (Docker, Kubernetes) and container orchestration platforms.
  • Passion for building and operating developer productivity tools, frameworks, and other aspects of platform engineering.
  • Familiarity with CI/CD pipelines and version control systems.
  • Knowledge of automated testing best practices and frameworks, ensuring software reliability through integration, performance, and end-to-end testing in distributed systems.
  • Good knowledge of cloud, on-prem, networking, and service architecture in multi-region multi-cloud setups.
  • Excited by the ambiguity and high-ownership culture of early-stage startups.
  • Pragmatic, solution-oriented, and scrappy.
  • Enjoy working collaboratively with a broad range of job functions and roles.

Nice-to-Haves / Tech Stack

  • Experience with: Go, Java, React, TypeScript, Kafka/Redpanda, Flink, Terraform/CDK.
  • Relevant technologies: ECharts, gRPC, PostgreSQL, Protobuf, Radix, Redux, Arrow, DataFusion, Parquet, Rust, Argo CD, GitHub Actions, Grafana, Kustomize, Linux, Prometheus, Terragrunt.

Compensation

  • Salary range: $150,000 - $200,000 per year.
  • Plus equity and benefits.

Skills

Kubernetes, Docker, AWS, GCP, Azure, Terraform, Go, Prometheus, Grafana, CI/CD, Kafka, Flink, gRPC, Postgres, Argo CD

Similar roles

DevOps / SRE jobs

Software Engineer - Networking Software and Services

Build software, services, and frameworks for network management, automation, and monitoring of large-scale GPU supercomputing fabrics. Requires deep network protocol knowledge and experience orchestrating tens of thousands of devices.

150k – 250k · Palo Alto, CA +1 · DevOps / SRE · Hybrid · 5+ YOE · Go · BGP

Software Engineer, Platform

Own infrastructure, CI/CD, and developer tooling for a fast-scaling AI-native ERP. Set technical direction for reliability, security, and API design in a hybrid NYC/SF environment.

150k – 270k · New York, NY +1 · DevOps / SRE · Hybrid · 5+ YOE · AWS · CI/CD

Software Engineer, Enablement

Design, build, and operate AI-powered engineering tools and developer productivity platforms. Focus on AI pairing pipelines, automated workflows, and internal tooling to accelerate engineering velocity.

150k – 180k · United States · DevOps / SRE · Remote · 3+ YOE · Go · LLMs

Infrastructure Engineer

Flint is seeking an Infrastructure Engineer to own the systems powering their AI-generated pages at scale. This 0-to-1 role involves building production-grade cloud architecture, CI/CD, deployments, observability, and security, with a focus on managing parallel background agents.

150k – 250k · San Francisco, CA · DevOps / SRE · On-site · AWS · GCP

Infrastructure Engineer

Founding Infrastructure Engineer to architect and scale resilient systems for AI/ML workloads, implement monitoring/observability, and automate infrastructure. Requires 5+ years production experience, Python, Kubernetes, and strong reliability focus.

150k – 300k · San Francisco, CA · DevOps / SRE · On-site · 5+ YOE · Python · Kubernetes