Data Scientist, Integrity Measurement

Owns measurement, metrics, and analysis for trust and safety harms, including prevalence estimation and response gaps, using AI-first methods. Requires strong statistics, data programming (Python/R/SQL), and trust and safety experience.

293k – 385k · San Francisco, CA · New York, NY · Data Science · Hybrid

About the role

In this role, you will:

  • Own measurement and quantitative analysis for severe, actor- and network-based usage harm verticals.
  • Develop and implement AI-first methods for prevalence measurement and other productionised safety metrics, including off-platform indicators.
  • Build metrics for goal-setting or A/B tests.
  • Own dashboards and metrics reporting for harm verticals.
  • Conduct analyses to inform improvements to review, detection, or enforcement.
  • Optimise LLM prompts for measurement.
  • Collaborate with safety teams on policies.
  • Provide metrics for leadership and external reporting.
  • Develop automation leveraging agentic products.

You might thrive in this role if you:

  • Are a senior data scientist with trust and safety experience.
  • Have deep statistics skills, especially sampling methods and prevalence estimation.
  • Have experience with severe harm areas like child safety or violence.
  • Are an excellent communicator with strong cross-functional skills.
  • Are capable in data programming languages (R or Python, SQL).
  • (Ideally) have experience with AI harms or leveraging AI for measurement.

Skills

Python · R · SQL · Statistics · Sampling Methods · Prevalence Estimation · LLM Prompting · Dashboards · A/B Testing · AI/ML

Similar roles

Data Science jobs

Data Scientist, Core Experimentation

Leads evolution of OpenAI's core experimentation platform, driving statistical strategy, designing methodologies, and building scalable Python/Spark pipelines to ensure reliable, trustworthy experiments at massive scale. Requires deep stats expertise, causal inference, and production experimentation experience.

293k – 325k · Bellevue, WA +1 · Data Science · Hybrid · Spark · CUPED

Data Scientist, Supply

Data Scientist focused on compute allocation and causal inference to optimize AI infrastructure decisions and connect supply choices to user outcomes. Requires strong Python/SQL skills and experience with constrained optimization and production systems.

285k – 460k · San Francisco, CA +1 · Data Science · On-site · 5+ YOE · SQL · Python

Economist

Economist (up to 5 years post-PhD) conducting empirical research on AI’s economic impacts using large datasets, causal inference, and structural modeling. Requires PhD and strong econometrics/SQL/Python skills.

266k – 385k · San Francisco, CA · Data Science · Hybrid · 3+ YOE · R · SQL

Research Economist, Economic Research

Measures AI's economic impacts through the Anthropic Economic Index using econometrics, ML, and novel data. Conducts empirical research on labor markets, productivity, inequality; requires PhD in Economics and strong empirical track record.

320k – 405k · San Francisco, CA · Data Science · Hybrid · R · SQL

Data Scientist, Safety Systems

Leads data-driven safety evaluation for AI production systems by defining metrics, implementing statistical methods, building dashboards, and analyzing real-world impacts. Requires 5+ years in quantitative roles with leadership and strong stats expertise.

255k – 405k · San Francisco, CA · Data Science · On-site · 5+ YOE · SQL · NLP