Principal Data Scientist, Health Informatics

128k – 229k | United States | Remote | 7+ YOE

Summary

Principal Data Scientist owning clinical data quality and building production ML/AI models for healthcare risk stratification and outcomes using claims, EHR, and ADT data. Requires 7+ years of Python/ML experience and deep fluency with clinical terminologies and healthcare data standards.

About the role

Responsibilities

  • Own clinical data quality across claims, EHR, and ADT: Define standards for how clinical data is structured, normalized, and validated as modeling inputs across payer claims (medical, pharmacy, eligibility), EHR data (Epic, Cerner, Athena), and real-time ADT feeds.
  • Bring deep familiarity with EHR data formats (FHIR, HL7, C-CDA) and how data from systems like Epic, Cerner, and Athena maps to clinical reality.
  • Build and ship production ML/AI models: Develop, validate, and deploy risk stratification, care gap prediction, treatment effect estimation, and LLM/foundation model applications — with rigor around leakage, calibration, fairness, and clinical face validity.
  • Apply health economics and outcomes methods: Translate raw clinical and claims data into decision-grade evidence through risk adjustment, utilization measurement, cost attribution, quasi-experimental evaluation, and outcomes measurement aligned with CMS, NCQA, and MCO reporting standards.
  • Advance machine learning and AI products: Bring senior modeling judgment to the product roadmap, owning the clinical and methodological soundness of what ships.
  • Set standards and mentor: Make architectural trade-offs, drive alignment across data science, engineering, product, and clinical stakeholders, and mentor junior data scientists to raise the technical bar of the team.

Requirements

  • Deep, hands-on fluency with claims, EHR, and ADT data, and strong command of clinical terminologies (ICD-10, SNOMED CT, LOINC, RxNorm, CPT/HCPCS) and value set curation.
  • Working experience with healthcare data standards and exchange formats — FHIR, HL7v2, and C-CDA.
  • Master's degree in Data Science, Biostatistics, Health Informatics, Computer Science, or a related field.
  • 7+ years of hands-on experience in Python, including data science and ML libraries.
  • Demonstrated ability to build, validate, and deploy production ML models on healthcare data, with end-to-end ownership from development through deployment and maintenance in a live environment.
  • Experience with ML pipelines, model versioning, and reproducible workflows at scale.
  • Proven ability to manage complex technical projects independently, align multiple stakeholders, and deliver on timelines.

Nice-to-Haves

  • PhD in health informatics, statistics, data science, or computer science.
  • Experience integrating EHR/HIE data via TEFCA, CommonWell, or comparable networks.
  • Experience with risk adjustment, utilization and cost measurement, and quasi-experimental evaluation.
  • Familiarity with MLOps best practices including experiment tracking and model registry (e.g. MLflow), CI/CD for ML pipelines, feature stores, and workflow orchestration tools such as SageMaker Pipelines.
  • Prior experience building on Medicaid or dual-eligible populations.
  • Peer-reviewed publications in healthcare ML, AI, biostatistics, or health economics.

Compensation & Benefits

  • Stock Options: Opportunity to invest in the company's growth.
  • Work-from-Home Stipend: A dedicated stipend for your first year to help set up your home office.
  • Medical, Vision, and Dental Coverage: Comprehensive plans to keep you and your family healthy.
  • Life Insurance: Basic life insurance to give you peace of mind.
  • Paid Time Off: 20 vacation days, accrued over the year, plus 11 paid holidays.
  • Parental Leave: 16 weeks of paid leave for birthing parents after six months of employment, and 8 weeks of bonding leave for non-birthing parents.
  • Retirement Savings: Access to a 401(k) plan with a company contribution, subject to a vesting schedule.
  • Commuter Benefits: Convenient options to support your commute needs.
  • Professional Development Stipend: A dedicated stipend supports professional development and growth.

Skills

Python, Machine Learning, FHIR, HL7, C-CDA, ICD-10, SNOMED CT, LOINC, RxNorm, CPT/HCPCS, EHR, Claims Data, Risk Adjustment, MLOps, MLflow