# Software Engineer
**Company:** [Polymath](https://hotfix.jobs/companies/polymath)
**Location:** Remote
**Skills:** Reinforcement Learning, Simulation Environments, Python, Infrastructure, Tooling, Debugging, Verifiers, Benchmarks
**Posted:** 2026-04-14
> Build simulation environments, tasks, and verifiers to train and evaluate long-horizon autonomous AI agents using reinforcement learning. Requires strong engineering fundamentals, high agency, and a focus on robust systems.
## Responsibilities
- Build diverse, high-fidelity environments that test agents in realistic settings
- Design complex tasks that require long-horizon reasoning and tool use
- Develop robust verifiers that reliably measure agent performance
- Improve infrastructure and tooling to run, debug, and improve environments
- Work closely with the research team to identify failure modes and turn them into new tasks and benchmarks

## Requirements
- Strong engineering fundamentals
- Enjoy building from first principles and solving open-ended technical problems
- High agency and a strong bias toward shipping
- A high quality bar and genuine care for building robust systems

**Apply:** https://hotfix.jobs/jobs/software-engineer-at-polymath-d625fbc2-8bed-46e0-9be5-25a399cc055a
**Canonical:** https://hotfix.jobs/jobs/software-engineer-at-polymath-d625fbc2-8bed-46e0-9be5-25a399cc055a