User Researcher, AI Evaluations

196k – 230k · San Francisco, CA · New York, NY · Hybrid · 5+ YOE · Jun 18

Summary

UX Researcher defining and scaling evaluation frameworks for Notion's AI-powered experiences. Focuses on establishing rubrics for model output quality and end-to-end user experience, running longitudinal studies, and operationalizing evaluation processes with product and data science teams.

About the role

What You'll Achieve

Define what “good” looks like (frameworks & rubrics)

Establish clear, reusable evaluation criteria that reflect real user expectations—helpfulness, trust, tone, control, and transparency
Translate qualitative insight into scoring guidance that can be applied consistently across teams and over time

Run recurring evals (longitudinal & feature-specific)

Run recurring longitudinal and feature-specific surveys and studies to measure experience quality over time against defined rubrics
Lead qualitative studies, side-by-side comparisons, and human-in-the-loop evaluation efforts
Help teams spot regressions, benchmark improvements, and understand when expectations shift

Anchor evaluation in real workflows (context > isolated feedback)

Ensure evals reflect jobs-to-be-done, user intent, and the full interaction journey (goal setting, delegation, review, iteration)
Help teams understand who is evaluating, what they’re trying to do, and why outputs succeed or fail

Identify failure modes & recovery behavior (guardrails)

Uncover breakdowns, regressions, and edge cases across the system—from model behavior to UI and integrations
Study how people notice issues, correct them, and continue their work
Turn insights into actionable guidance for guardrails, fixes, and prioritization

Operationalize evaluation with partners (process & tooling)

Collaborate closely with Product, Design, Engineering, and Data Science to align on target use cases
Build scalable evaluation loops (human-in-the-loop review, longitudinal studies, and calibration of automated/LLM-judge approaches against human judgment)

Skills You'll Need to Bring

Ability to operationalize insight into measurement: turning “soft” user expectations (trust, tone, usefulness, clarity) into concrete rubrics, scoring guidelines, and observable metrics
AI fluency and systems thinking: hands-on experience with AI products and the ability to reason about how model behavior, uncertainty, and system constraints shape user experience
Experience evaluating AI-enabled products (LLMs, agents, generative UI/workflow automation) and working with Data Science/ML partners on measurement strategy and evaluation tooling
Clear communication and impact orientation: aligning diverse partners around shared definitions of quality, creating artifacts that enable teams to act consistently
Strong UX research craft (quant + qual): choosing the right methods for the question—interviews, benchmarking, surveys, experiments—and synthesizing into actionable guidance
Pragmatism in fast-moving environments: prioritizing ruthlessly, working through ambiguity, balancing scrappy iteration with deep dives

Experience

5+ years doing UX research in industry

Nice to Haves

Familiarity with LLM-as-judge methods, prompt design for evaluators, or “golden dataset” creation
Experience using AI research tooling for rapid synthesis and communication (e.g., Dovetail, Listen Labs, Maze, Outset), as well as AI observability tooling like Braintrust
Experience using data querying languages (e.g., SQL), scripting languages (e.g., Python), or statistical/mathematical software (e.g., R, SAS, MATLAB)
Master’s or PhD in HCI, Psychology, Behavioral Science, Anthropology, Sociology, or a related field

Skills

UX Research, Qualitative Research, Quantitative Research, AI Product Evaluation, LLM Evaluation, Rubric Development, Human-in-the-Loop Evaluation, Data Science Collaboration, Python, SQL
