Senior Staff Site Reliability Engineer

245k – 270k · San Francisco, CA · New York, NY · Chicago, IL · DevOps / SRE · Hybrid · 8+ YOE
Summary

Ironclad is seeking a Senior Staff Site Reliability Engineer to provide technical leadership and strategic direction for the SRE team, champion engineering excellence, and drive architectural resilience for its cloud platform.

About the role

Roles & Responsibilities:

  • Provide technical leadership and strategic direction for the Site Reliability Engineering team and our broader Cloud Platform
  • Define and champion SRE best practices, setting the standard for engineering excellence across the entire organization
  • Solve the whole problem: architect for resiliency, identify risks, and make it happen
  • A proven track record of designing and driving an 'automate-everything' culture (build, test, deploy, monitor)
  • Preference for collaboration, open communication and reaching across functional borders
  • Thorough understanding of backup/recovery systems, cloud storage architecture, and distributed systems
  • Be on an on-call rotation to respond to incidents that impact Ironclad’s availability, and provide support with internal or customer-facing incidents
  • Translate the near-, mid-, and long-term strategic needs of the business into a scalable, resilient platform roadmap
  • Drive critical architectural decisions with a relentless focus on security, scalability, and high performance
  • Be a mentor and multiply our team’s output through leadership and guidance

Key Skills:

  • 8+ years of DevOps / SRE experience
  • 5+ years of coding experience
  • Expert knowledge of Kubernetes and Google Cloud Platform (or similar provider)
  • Ability to build resilient infrastructure
  • Modern GitOps: experience with tools like Terraform/Pulumi, CircleCI, and ArgoCD
  • Experience with modern AI-enabled tools such as Claude Code, Cursor, and Zed
  • Troubleshooting and analytical skills, including the ability to review pull requests containing human- and AI-generated code
  • Strong technical aptitude and exceptional communication skills (written and verbal)
  • Desire to help customers, and the ability to dive deep and learn a new product
  • Experience and desire to work cross-functionally
  • Team- and goal-oriented
  • High output; low ego

Bonus Points if you have:

  • Experience with multi-region support
  • Expert database management experience
  • Experience managing AI infrastructure
  • TypeScript experience

Base Salary Range:

  • Staff Site Reliability Engineer: $220,000 - $235,000
  • Senior Staff Site Reliability Engineer: $245,000 - $270,000

US Full-Time Employee Benefits at Ironclad:

  • 100% health coverage for employees (medical, dental, and vision), and 75% coverage for dependents with buy-up plan options available
  • Market-leading leave policies, including gender-neutral parental leave and compassionate leave
  • Family forming support through Maven for you and your partner
  • Paid time off - take the time you need, when you need it
  • Monthly stipends for wellbeing, hybrid work, and (if applicable) cell phone use
  • Mental health support through Modern Health, including therapy, coaching, and digital tools
  • Pre-tax commuter benefits (US Employees)
  • 401(k) plan with Fidelity with employer match (US Employees)
  • Regular team events to connect, recharge, and have fun
  • And most importantly: the opportunity to help build the company you want to work at
Skills
Kubernetes, Google Cloud Platform, Terraform, Pulumi, CircleCI, ArgoCD, Claude Code, TypeScript, Database Management