# Product Reliability Engineer
**Company:** [PointOne](https://hotfix.jobs/companies/pointone)
**Location:** New York, NY
**Salary:** $100K-$160K
**Experience:** 2+ years
**Skills:** AWS, AWS Lambda, SQS, RDS, CloudWatch, Go, TypeScript, Observability, Telemetry, Logging, Alerting, Dashboards, Distributed Systems, Debugging, Incident Response
**Posted:** 2026-04-10
> Owns end-to-end system reliability, incident response, observability, and proactive stability improvements in a serverless AWS environment. Requires 2+ years of software engineering experience in production-facing roles, strong debugging skills, and hands-on AWS, Go, and TypeScript skills.
## Job Description
## Reliability & Incident Response
- Respond quickly to automated alerts and customer-reported issues
- Triage, diagnose, and resolve production incidents with a bias toward permanent fixes over workarounds
- Build and maintain incident response playbooks and postmortem processes
- Coordinate cross-functionally with customer success managers and key account stakeholders to maintain customer trust during incidents

## Observability & Prevention
- Design and instrument telemetry, logging, and alerting across our serverless AWS stack
- Build dashboards and health metrics that surface issues before customers feel them
- Identify recurring failure patterns and drive systemic fixes into the codebase
- Reduce operational toil through automation

## Product Stability
- Contribute directly to the codebase: improve resilience, reduce tech debt, and build automation so that bugs are resolved quickly and with little human intervention
- Partner with engineers on new feature launches to assess reliability risks before they ship
- Make data-driven recommendations on where to invest in stability

## What We're Looking For
- 2+ years of software engineering experience, with meaningful time spent in reliability, platform, or production-facing roles
- Strong debugging instincts and comfort tracing failures across distributed systems using logs, traces, and metrics
- Hands-on experience with AWS (Lambda, SQS, RDS, CloudWatch or equivalent)
- Comfortable reading and writing Go, TypeScript, or similar backend languages
- Experience building or improving observability infrastructure (alerting, dashboards, telemetry)
- High ownership mentality: you close the loop, you write the postmortem, you ship the fix
- **Strong plus**: experience in legaltech, fintech, healthtech, or other high-sensitivity, always-on environments
**Apply:** https://hotfix.jobs/jobs/product-reliability-engineer-at-pointone-2751264f-66b1-47ac-b52c-6267a2875aec
**Canonical:** https://hotfix.jobs/jobs/product-reliability-engineer-at-pointone-2751264f-66b1-47ac-b52c-6267a2875aec