Data Scientist, Core Experimentation

Leads evolution of OpenAI's core experimentation platform, driving statistical strategy, designing methodologies, and building scalable Python/Spark pipelines to ensure reliable, trustworthy experiments at massive scale. Requires deep stats expertise, causal inference, and production experimentation experience.

293k – 325k | Bellevue, WA | Seattle, WA | Data Science | Hybrid

About the role

Responsibilities

Drive the statistical direction and technical strategy for OpenAI’s experimentation platform
Design and improve experimentation methodologies used across product and research teams
Build pragmatic solutions to real-world experimentation challenges, balancing rigor with operational simplicity
Improve the reliability and trustworthiness of experiment results, including detection and prevention of bias, logging issues, and data quality failures
Develop scalable analytical systems and pipelines in Python and distributed compute environments
Partner with engineers and product teams to improve experiment design, metric quality, and decision-making practices
Lead investigations into complex experimentation anomalies and measurement failures
Establish best practices for experimentation governance, interpretation, and statistical correctness
Mentor other data scientists and raise the overall technical bar for experimentation and causal inference

Requirements

Experience building, scaling, or operating experimentation platforms at a large technology company
Deep expertise in statistics, causal inference, and online experimentation methodology
Strong understanding of practical experimentation challenges in production systems
Experience with areas such as variance reduction, CUPED, sequential testing, SRM detection, metric design, or heterogeneous effects
Strong coding and systems skills in Python and large-scale data processing frameworks (e.g., Spark)
Experience designing analytical data models and scalable experimentation pipelines
Ability to communicate complex statistical concepts clearly to technical and non-technical audiences
Track record of influencing technical strategy through hands-on technical leadership

Nice-to-Haves

Experience in large-scale product experimentation, ML experimentation, ranking systems, marketplace systems, or similar high-scale experimentation domains

Compensation

$293K - $325K USD

Skills

Python, Spark, Statistics, Causal Inference, Online Experimentation, CUPED, Sequential Testing, SRM Detection, Variance Reduction, Metric Design
