Skip to content

AI Research Resident

AI Research Resident collaborates on research projects developing benchmarks and environments for long-horizon AI agents, identifying model failure modes, and training autonomous agents. Requires current MS/PhD enrollment, RL experience, systems engineering, and strong publications.

200k – 200kSan Francisco, CAAI ResearchRemoteEntry level

About the role

Responsibilities

  • Identify failure modes in frontier models.
  • Develop rigorous benchmarks that evaluate how well frontier agents perform on complex, realistic tasks requiring long-horizon reasoning and tool use in dynamic environments.
  • Train autonomous agents that can reason, plan, and act over extended time horizons.

Requirements

  • Currently pursuing an MS or PhD program in Computer Science or a related field.
  • Experience with reinforcement learning, benchmarking frontier models, or model post-training.
  • Experience with systems engineering and ability to write production-quality code.
  • Strong track record of publications.
  • High agency, move quickly, and enjoy working on open-ended research problems.

Compensation

  • $200k / year prorated to the number of hours committed (full-time or part-time).

Skills

Reinforcement LearningBenchmarksFrontier ModelsModel Post-TrainingSystems EngineeringProduction-Quality Code

Similar roles

AI Research jobs

ML Research Scientist, Prediction & Smart Agents

Build state-of-the-art ML models to predict traffic behavior for autonomous driving, using generative sequence modeling and controllable agents for planning and simulation. Requires PhD preferred, 2+ years deploying ML systems, and expertise in PyTorch and robotics ML.

194k – 291kMountain View, CAAI ResearchOn-site2+ YOEC++Python

Machine Learning Systems Research Engineer, Agent Post-training - Enterprise GenAI

Develops and optimizes post-training algorithms for agent RL platforms, focusing on LLM training, inference frameworks, and multi-agent systems. Requires 1-3 years production LLM experience, expertise in PyTorch/CUDA, RLHF/PPO, and advanced degree.

218k – 273kSan Francisco, CA +2AI ResearchOn-site1+ YOEPpoCUDA

Machine Learning Research Engineer, Agents - Enterprise GenAI

Develops and deploys state-of-the-art ML models and agents for enterprise GenAI using RL training and post-training algorithms. Requires 1-3 years LLM production experience, RLHF expertise, recent top publications, and advanced CS degree.

218k – 273kSan Francisco, CA +2AI ResearchOn-site1+ YOEPpoLLMs

AI Researcher

Advances AI products through post-training SOTA LLMs using supervised and reinforcement learning techniques on rich query datasets. Owns data pipelines, training frameworks, and model integration while collaborating across teams. Requires 2-6+ years in large-scale LLMs and Python/PyTorch expertise; PhD preferred.

220k – 485kSan Francisco, CA +1AI ResearchOn-site2+ YOESftDpo

Machine Learning Research Scientist: Generative Modeling for Planning

Develops state-of-the-art generative models like diffusion and flow-matching for autonomous planning in self-driving tech. Requires PhD or MSc with 2-3 years experience in generative modeling for robotics, strong Python/C++ skills, and top research publications.

160k – 241kMountain View, CAAI ResearchOn-site2+ YOEC++LLMs