
Site Reliability Engineer

Site Reliability Engineer enhances system observability, reliability, and availability at a prediction markets platform. Builds automation, optimizes cloud infrastructure (Kubernetes, Docker, Terraform), debugs issues, and participates in on-call rotations. Requires 4+ years software engineering experience.

100k – 250k | New York, NY | DevOps / SRE | Onsite | 4+ YOE

About the role

What You’ll Do

  • Improve observability, reliability, and service availability by defining and measuring key metrics
  • Build automation and systems that eliminate toil and reduce operational burden
  • Collaborate with core infrastructure engineers to performance-tune and optimize cloud deployments (Docker, Terraform, Kubernetes, EC2, etc.)
  • Partner with product teams to minimize service disruptions and automate incident response
  • Identify and analyze reliability problems across the stack, designing and implementing software for significant, long-term improvements
  • Mentor engineers and drive a culture where reliability is a core engineering value
  • Write high-quality, well-tested code that supports internal and external customer needs
  • Debug complex technical issues and improve system usability, operability, and diagnosability
  • Review feature designs across the company and ensure security, safety, scalability, and architectural clarity
  • Build and maintain integrations with third-party vendors
  • Participate in on-call rotations to troubleshoot and resolve urgent issues

What You Bring

  • 4+ years of software engineering experience
  • Experience designing, building, scaling, and maintaining production services and service-oriented architectures
  • Strong system design, coding, debugging, performance-tuning, and observability skills
  • High-quality coding practices with strong testing discipline
  • Excellent written and verbal communication; comfort working transparently across teams
  • Strong interpersonal skills across junior-to-principal engineering levels
  • Ability to think clearly under pressure and dive into any layer of the stack
  • Passion for building an open financial system that connects the world
  • Willingness to participate in on-call rotations and swiftly resolve issues

Bonus Points

  • Experience designing highly reliable, high-throughput, low-latency systems
  • Experience with Datadog
  • Experience with Rust, Go, and Terraform
  • Experience with AWS, GCP, or Azure
  • Experience operating in regulated environments
  • Experience writing training materials or company-facing engineering content

NYC Pay Transparency

Salary: $100,000–$250,000 annually, plus equity and benefits.

Skills

Kubernetes, Docker, Terraform, Datadog, AWS, GCP, Azure, Rust, Go, EC2
