Senior Staff Site Reliability Engineer
Ironclad is seeking a Senior Staff Site Reliability Engineer to provide technical leadership and strategic direction for the SRE team, champion engineering excellence, and drive architectural resilience for its cloud platform.
Roles & Responsibilities:
- Provide technical leadership and strategic direction for the Site Reliability Engineering team and our broader Cloud Platform
- Define and champion SRE best practices, setting the standard for engineering excellence across the entire organization
- Solve the whole problem. Architect for resiliency, identify risks, and make it happen.
- A proven track record of designing and driving an 'automate-everything' culture (build, test, deploy, monitor)
- Preference for collaboration, open communication and reaching across functional borders
- Thorough understanding of backup/recovery systems, cloud storage architecture, and distributed systems
- Participate in an on-call rotation to respond to incidents that impact Ironclad’s availability, and provide support during internal or customer-facing incidents
- Translate the near, mid, and long-term strategic needs of the business into a scalable, resilient platform roadmap
- Drive critical architectural decisions with a relentless focus on security, scalability, and high performance
- Be a mentor: multiply our team’s output through leadership and guidance
Key Skills:
- 8+ years of DevOps / SRE experience
- 5+ years of coding experience
- Expert knowledge of Kubernetes and Google Cloud Platform (or similar provider)
- Ability to build resilient infrastructure
- Modern GitOps: experience with tools like Terraform/Pulumi, CircleCI, and ArgoCD
- Experience with modern AI-enabled tools such as Claude Code, Cursor, and Zed
- Strong troubleshooting and analytical skills, including the ability to review pull requests containing human- and AI-generated code
- Strong technical aptitude and exceptional communication skills (written and verbal)
- Desire to help customers, and the ability to dive deep and learn a new product
- Experience and desire to work cross-functionally
- Team- and goal-oriented
- High output; low ego
Bonus Points if you have:
- Experience with multi-region support
- Expert database management experience
- Experience managing AI Infrastructure
- TypeScript experience
Base Salary Range:
- Staff Site Reliability Engineer: $220,000 - $235,000
- Senior Staff Site Reliability Engineer: $245,000 - $270,000
US Full-Time Employee Benefits at Ironclad:
- 100% health coverage for employees (medical, dental, and vision), and 75% coverage for dependents with buy-up plan options available
- Market-leading leave policies, including gender-neutral parental leave and compassionate leave
- Family forming support through Maven for you and your partner
- Paid time off - take the time you need, when you need it
- Monthly stipends for wellbeing, hybrid work, and (if applicable) cell phone use
- Mental health support through Modern Health, including therapy, coaching, and digital tools
- Pre-tax commuter benefits (US Employees)
- 401(k) plan with Fidelity with employer match (US Employees)
- Regular team events to connect, recharge, and have fun
- And most importantly: the opportunity to help build the company you want to work at