# Product Manager, Public Sector GenAI Test & Evaluation (T&E)
**Company:** [Scale AI](https://hotfix.jobs/companies/scale-ai)
**Location:** San Francisco, CA; St. Louis, MO; New York, NY; Washington, DC
**Salary:** $154K-$257K
**Experience:** 3+ years
**Skills:** LLMs, RAG, Autonomous Agents, Software Engineering, Systems Architecture, AI Evaluation, Technical Project Management, Linear, ML Research, Infrastructure
**Posted:** 2026-04-24
> Defines the vision and roadmap for GenAI test & evaluation infrastructure in the public sector, working across engineering orgs to resolve bottlenecks and drive execution for government agentic applications. Requires 3+ years of technical experience in engineering or program management, plus AI evaluation expertise.
## Job Description
## Minimum Qualifications

- **Engineering Depth**: 3+ years of experience in software engineering, systems architecture, or highly technical program management. Must be able to read code, understand system architecture, and participate in technical design reviews alongside engineering teams.
- **Evaluation Systems Expertise**: Proven experience designing, owning the roadmap for, or operating the infrastructure required to continuously measure, improve, and show the performance of AI applications.
- **Problem Distillation**: Demonstrated experience taking a vaguely defined problem (e.g., "our evaluation cycles are too slow") and delivering a technical roadmap, resource requirements, and measurable success metrics within a narrow time window.
- **Ambiguity Management**: Proven track record of taking a project from "stalled/undefined" to "shipped" in a high-pressure environment. Point to at least two instances where you inherited a failing project and saw it through to production.
- **Cross-Functional Leadership**: Led multiple projects that required direct alignment between at least three distinct engineering organizations (e.g., Infrastructure, ML Research, and Product).
- **Operational Execution**: Experience using technical project management tools (e.g., Linear) to provide consistent weekly reporting on delivery velocity and blockers to executive stakeholders.

## Preferred Qualifications

- **Security Clearance**: Active Secret, Top Secret, or TS/SCI clearance.
- **GenAI Implementation**: Practical experience developing or evaluating features built specifically on LLMs, RAG, or autonomous agent workflows.
- **Technical Rigor**: Advanced degree in Computer Science, Engineering, or a related field.
- **Public Sector Expertise**: 2+ years of experience working with DoD, IC, or Civil agencies on mission-critical software deployments.

## Compensation

Base salary range varies by location:
- San Francisco, New York: $205,600–$257,000 USD
- Hawaii, Washington DC, Texas, Colorado: $184,800–$231,000 USD
- St. Louis: $154,400–$193,000 USD

Includes equity, comprehensive health/dental/vision, retirement, learning stipend, PTO, commuter stipend.
**Apply:** https://hotfix.jobs/jobs/product-manager-public-sector-genai-test-evaluation-t-e-at-scale-ai-46b55c3c-7d3d-4745-b9bd-0e81de2e4e00
**Canonical:** https://hotfix.jobs/jobs/product-manager-public-sector-genai-test-evaluation-t-e-at-scale-ai-46b55c3c-7d3d-4745-b9bd-0e81de2e4e00