Responsibilities
- Develop monitoring techniques and observability methods that track AI behavior in real time to identify and flag deviations, emergent capabilities, or anomalous outputs.
- Research mechanisms for layered control, including fail-safes, oversight protocols, and intervention methods that can halt or redirect AI systems when risks are detected.
- Design red-team simulations to probe weaknesses in oversight and control mechanisms, and build mitigations to close identified gaps.
- Collaborate with policymakers, engineers, and other researchers to establish standards and benchmarks for AI monitoring and escalation.
Requirements
- Commitment to promoting safe, secure, and trustworthy AI deployments.
- Practical experience conducting collaborative technical research, designing control and monitoring experiments for AI systems, and turning research ideas into working prototypes.
- Track record of published research in machine learning, particularly in generative AI.
- At least three years of experience addressing sophisticated ML problems in research or product development.
- Strong written and verbal communication skills for cross-functional teams.
Nice to Have
- Experience with runtime monitoring, anomaly detection, or observability for ML systems.
- Familiarity with AI control or alignment research (e.g., scalable oversight, interpretability, debate).
- Experience with post-training and RL techniques such as RLHF, DPO, and GRPO.
Compensation
Base salary range: $197,400 - $246,750 USD (San Francisco, New York, Seattle), plus equity and benefits including health coverage, retirement, learning stipend, PTO, and commuter stipend.