# Member of Technical Staff - Reliability Engineering
**Company:** [Modal](https://hotfix.jobs/companies/modal)
**Location:** New York, NY; San Francisco, CA
**Salary:** $150K-$350K
**Experience:** 5+ years
**Skills:** AWS, Kubernetes, Auto Scaling, Fleet Management, Capacity Planning, Monitoring Systems
**Posted:** 2026-01-19
> Define and implement reliability systems for a growing AI cloud infrastructure platform, including architectural improvements, operational processes, monitoring, and incident response. Requires 5+ years of production coding experience, 2+ years of on-call experience, and strong cloud skills.
## Requirements
- 5+ years of experience writing high-quality production code.
- 2+ years of on-call experience for critical production services.
- Strong cloud skills and deep familiarity with at least one hyperscaler cloud (AWS preferred).
- Familiarity with auto scaling, fleet management, and capacity planning at scale.
- Experience owning and scaling Kubernetes clusters to thousands of nodes a plus.
- Experience with systems safety research (e.g. STAMP) and control theory a plus.
- Ability to work in-person in our NYC, SF or Stockholm offices.
**Apply:** https://hotfix.jobs/jobs/member-of-technical-staff-reliability-engineering-at-modal-2385e01a-468a-43f5-8063-d8b03079ceb5