Senior Staff Site Reliability Engineer

245k – 270k | San Francisco, CA | New York, NY | Chicago, IL | DevOps / SRE | Hybrid | 8+ YOE | May 29

Summary

Ironclad is seeking a Senior Staff Site Reliability Engineer to provide technical leadership and strategic direction for the SRE team, champion engineering excellence, and drive architectural resilience for its cloud platform.

About the role

Roles & Responsibilities:

Provide technical leadership and strategic direction for the Site Reliability Engineering team and our broader Cloud Platform
Define and champion SRE best practices, setting the standard for engineering excellence across the entire organization
Solve the whole problem. Architect for resiliency, identify risks, and make it happen.
A proven track record of designing and driving an 'automate-everything' culture (build, test, deploy, monitor)
Preference for collaboration, open communication and reaching across functional borders
Thorough understanding of backup/recovery systems, cloud storage architecture, and distributed systems
Participate in an on-call rotation to respond to incidents that impact Ironclad’s availability, and provide support with internal or customer-facing incidents
Translate the near, mid, and long-term strategic needs of the business into a scalable, resilient platform roadmap
Drive critical architectural decisions with a relentless focus on security, scalability, and high performance
Be a mentor: multiply our team’s output through leadership and guidance

Key Skills:

8+ years of DevOps / SRE experience
5+ years of coding experience
Expert knowledge of Kubernetes and Google Cloud Platform (or similar provider)
Ability to build resilient infrastructure
Modern GitOps: experience with tools like Terraform/Pulumi, CircleCI, and ArgoCD
Experience with modern AI-enabled tools such as Claude Code, Cursor, or Zed
Strong troubleshooting and analytical skills; able to review pull requests containing human- and AI-generated code
Strong technical aptitude and exceptional communication skills (written and verbal)
Desire to help customers, and the ability to dive deep and learn a new product
Experience and desire to work cross-functionally
Team- and goal-oriented
High output; low ego

Bonus Points if you have:

Experience with multi-region support
Expert database management experience
Experience managing AI infrastructure
TypeScript experience

Base Salary Range:

Staff Site Reliability Engineer: $220,000 - $235,000
Senior Staff Site Reliability Engineer: $245,000 - $270,000

US Full-Time Employee Benefits at Ironclad:

100% health coverage for employees (medical, dental, and vision), and 75% coverage for dependents with buy-up plan options available
Market-leading leave policies, including gender-neutral parental leave and compassionate leave
Family forming support through Maven for you and your partner
Paid time off - take the time you need, when you need it
Monthly stipends for wellbeing, hybrid work, and (if applicable) cell phone use
Mental health support through Modern Health, including therapy, coaching, and digital tools
Pre-tax commuter benefits (US Employees)
401(k) plan with Fidelity with employer match (US Employees)
Regular team events to connect, recharge, and have fun
And most importantly: the opportunity to help build the company you want to work at

Skills

Kubernetes, Google Cloud Platform, Terraform, Pulumi, CircleCI, ArgoCD, Claude Code, TypeScript, Database Management
