Research Engineer, Domain Scaling

1 – 2 | San Francisco, CA | New York, NY | ML Engineering | Hybrid
Summary

Own end-to-end data strategy and RL environment creation for domain-specific knowledge work (finance, healthcare, legal). Combine applied research with hands-on data sourcing, vendor management, and model performance measurement.

About the role

Responsibilities

  • Own the data strategy for knowledge work verticals end-to-end, from task sourcing through RL training
  • Manage technical relationships with external data vendors, including evaluation of data quality and reward design
  • Collaborate with domain experts to design data pipelines and evaluations
  • Explore novel ways of creating RL environments for high-value tasks
  • Develop and improve QA frameworks to catch reward hacking and ensure environment quality
  • Run generalization experiments to measure how data strategy changes improve model capabilities
  • Partner with other RL research teams and product teams to translate capability goals into training environments and evals

Requirements

  • Experience with fine-tuning large language models for specific domains or real-world use cases
  • Experience with reinforcement learning, reward design, or training data curation for LLMs
  • Comfortable managing technical vendor relationships and iterating quickly on feedback
  • See value in reading through datasets to understand them and spot issues
  • Strong cross-functional collaboration skills
  • Passionate about making AI more useful and accessible across different industries
  • Excited about a role that includes a combination of applied research and hands-on data work

Nice-to-Haves

  • Experience training production ML systems
  • Experience designing evals or benchmarks for LLMs
  • Domain expertise in a vertical where models would be more useful
  • Experience working with external vendors or technical partners

Education

  • Bachelor’s degree or an equivalent combination of education, training, and/or experience in a field relevant to the role

Compensation

  • Annual Salary: $1—$2 USD
Skills

  • Reinforcement Learning
  • Fine-tuning LLMs
  • Reward Design
  • Training Data Curation
  • Data Pipelines
  • QA Frameworks
  • Model Evaluation
  • Vendor Management
  • Cross-functional Collaboration
  • Applied Research