Machine Learning Research Engineer, Agents - Enterprise GenAI

Develops and deploys state-of-the-art ML models and agents for enterprise GenAI using RL training and post-training algorithms. Requires 1-3 years of LLM production experience, RLHF expertise, recent publications at top conferences, and an advanced degree in computer science or a related field.

218k – 273k | San Francisco, CA | New York, NY | Seattle, WA | AI Research | On-site | 1+ YOE

About the role

Responsibilities

Train state-of-the-art models (internal and community-developed) for enterprise deployment.
Research and integrate cutting-edge algorithms into the training stack.
Build agents using proprietary algorithms to optimize datasets, including tools, multi-agent systems, and complex rewards.

Requirements

1-3 years building LLMs in production environments.
Experience with post-training methods (RLHF/RLVR, PPO/GRPO).
Publications in top conferences (NeurIPS, ICLR, ICML) within the last 2 years.
PhD or Master's in Computer Science or related field.

Skills

LLMs, RLHF, RLVR, PPO, GRPO, Reinforcement Learning, Multi-Agent Systems, Agent RL Training, Post-Training Algorithms

Similar roles

AI Research jobs

Scale AI

Machine Learning Systems Research Engineer, Agent Post-training - Enterprise GenAI

Develops and optimizes post-training algorithms for agent RL platforms, focusing on LLM training, inference frameworks, and multi-agent systems. Requires 1-3 years of production LLM experience, expertise in PyTorch/CUDA and RLHF/PPO, and an advanced degree.

218k – 273k | San Francisco, CA +2 | AI Research | On-site | 1+ YOE | PPO | CUDA

Perplexity

AI Researcher

Advances AI products through post-training SOTA LLMs using supervised and reinforcement learning techniques on rich query datasets. Owns data pipelines, training frameworks, and model integration while collaborating across teams. Requires 2-6+ years in large-scale LLMs and Python/PyTorch expertise; PhD preferred.

220k – 485k | San Francisco, CA +1 | AI Research | On-site | 2+ YOE | SFT | DPO

Polymath

AI Research Resident

Collaborates on research projects developing benchmarks and environments for long-horizon AI agents, identifying model failure modes, and training autonomous agents. Requires current MS/PhD enrollment, RL experience, systems engineering skills, and a strong publication record.

200k – 200k | San Francisco, CA | AI Research | Remote | Entry level | Benchmarks | Frontier Models

Nuro

ML Research Scientist, Prediction & Smart Agents

Builds state-of-the-art ML models to predict traffic behavior for autonomous driving, using generative sequence modeling and controllable agents for planning and simulation. Requires 2+ years deploying ML systems and expertise in PyTorch and robotics ML; PhD preferred.

194k – 291k | Mountain View, CA | AI Research | On-site | 2+ YOE | C++ | Python

Nuro

Machine Learning Research Scientist: Generative Modeling for Planning

Develops state-of-the-art generative models, such as diffusion and flow-matching models, for autonomous planning in self-driving technology. Requires a PhD or MSc with 2-3 years of experience in generative modeling for robotics, strong Python/C++ skills, and top research publications.

160k – 241k | Mountain View, CA | AI Research | On-site | 2+ YOE | C++ | LLMs