# Research Engineer, Judgment Systems
**Company:** [Variance](https://hotfix.jobs/companies/variance)
**Location:** San Francisco, CA
**Salary:** $250K-$400K
**Skills:** Machine Learning, LLMs, Fine-Tuning, Reinforcement Learning, Retrieval Augmented Generation, Agent Systems, Post-Training, Evaluation, Benchmarking, Python
**Posted:** 2026-03-31
> The Research Engineer designs evaluations, studies model failures, and builds research loops to improve AI agents for high-stakes fraud detection and judgment tasks. The role requires ML training experience, experimental rigor, and strong engineering skills suited to adversarial environments.
## Responsibilities
- Train, fine-tune, and improve models for fraud, scams, abuse, and other high-stakes judgment workflows
- Own research threads focused on improving agent capability, reliability, and decision quality
- Build proprietary benchmarks, datasets, and evals that reflect real customer workflows, regulatory constraints, and real failure modes
- Design and run experiments across post-training, retrieval, tool use, planning, memory, and long-horizon agent behavior
- Study where models break, why they break, and how to make them more robust
- Prototype new training strategies, agent architectures, and evaluation methods, then turn the best ideas into production systems
- Work closely with founders and engineering to translate research advances into deployed product capabilities
- Push the boundary of what AI agents can do in regulated industries

## Requirements
- Care deeply about protecting people from fraud, scams, and abuse
- Have strong opinions about model quality, evaluation, and experimental rigor
- Want to work on core model and agent behavior
- Are excited to train, fine-tune, and improve models for hard real-world judgment tasks
- Think in tight research loops: hypothesis, experiment, evaluation, failure analysis, iteration
- Thrive in ambiguous, fast-moving environments where the path is not obvious and the feedback loop is short
- Are motivated by the challenge of making AI systems work in adversarial, regulated, and high-consequence settings
- Want to help define what trustworthy AI means in real-world use cases

## Preferred Background
- Experience training, fine-tuning, or evaluating modern ML systems
- Strong programming skills and comfort working in research-heavy codebases
- Familiarity with **LLMs**, agent systems, post-training, **reinforcement learning**, retrieval, or adjacent areas
- Ability to design clean experiments and draw reliable conclusions from noisy results
- Strong engineering judgment and a bias toward building
- Interest in fraud, risk, trust and safety, compliance, or other regulated and adversarial domains

## Compensation & Benefits
- Competitive salary ($250,000 - $400,000) and meaningful equity
- Platinum-level medical, dental, and vision insurance
- Unlimited PTO, sick leave, and parental leave
- Up to $100 per month in reimbursement for personal health and wellness expenses
- 401(k) plan

**Apply:** https://hotfix.jobs/jobs/research-engineer-judgment-systems-at-variance-d22f5c9e-5811-409f-8801-81a01c43869a
**Canonical:** https://hotfix.jobs/jobs/research-engineer-judgment-systems-at-variance-d22f5c9e-5811-409f-8801-81a01c43869a