Applied Research Engineer, Agents

250k – 300k | San Francisco, CA | Hybrid | 3+ YOE
Summary

Develops frameworks, data pipelines, and benchmarks for autonomous AI agents using SFT and RL. Collaborates with frontier AI labs and publishes research. Requires a Master's or Ph.D. plus 3+ years of ML experience and deep learning proficiency.

About the role

Responsibilities

  • Create frameworks and tools to construct, train, benchmark and evaluate autonomous agent capabilities.
  • Design agent-focused data programs using supervised fine-tuning (SFT) and reinforcement learning (RL) methodologies.
  • Develop data pipelines from diverse sources like code repositories, web browsers, and computer systems.
  • Implement and adapt popular open-source agent libraries and benchmarks with proprietary datasets and models.
  • Engage with research teams in frontier AI labs and the wider AI community to understand evolving agent data needs for frontier models and share best practices.
  • Collaborate closely with frontier AI lab customers to understand requirements and guide model development.
  • Publish research findings in academic journals, conferences, and blog posts.

Requirements

  • Ph.D. or Master's degree in Computer Science, Machine Learning, AI, or related field.
  • At least 3 years of experience addressing sophisticated ML problems with successful delivery to customers.
  • Experience building and training autonomous agents—tool use, structured outputs, multi-step planning—across browsers/GUI, codebases, and databases using SFT and RL.
  • Experience constructing and evaluating agentic benchmarks (e.g., SWE-bench, WebArena, τ-bench, OSWorld) and reliability/efficiency suites (e.g., WABER).
  • Adept at interpreting research literature and quickly turning new ideas into prototypes.
  • Deep understanding of frontier models (autoregressive, diffusion), post-training (SFT, RLVR, RLAIF, RLHF, etc.), and their human data requirements.
  • Proficient in Python, data science libraries and deep learning frameworks (e.g., PyTorch, JAX, TensorFlow).
  • Strong analytical and problem-solving abilities in ambiguous situations.
  • Excellent communication skills.
  • Track record of publications in top-tier AI/ML venues (e.g., ACL, EMNLP, NAACL, NeurIPS, ICML, ICLR).

Compensation

  • Annual base salary range: $250,000–$300,000 USD
Skills
Python | PyTorch | JAX | TensorFlow | LLM agents | SFT | RL | SWE-bench | WebArena | OSWorld
Similar roles at this salary range
Upstart

Principal Applied Scientist

As a Principal Applied Scientist, you will define the technical direction for offer optimization and conversion modeling systems, working across teams to integrate models and optimization systems. This role involves structuring ambiguous problems, designing solutions, and providing technical oversight to ensure a coherent long-term vision.

220k – 330k | United States | AI Research | Remote | Fintech | Statistics
Anthropic

Research Scientist, Life Sciences

Anthropic is seeking a Research Scientist to join their Life Sciences team. This role involves building and shipping agentic tools, designing evaluation benchmarks, and partnering with external users to improve model capabilities on scientific tasks.

300k – 320k | San Francisco, CA | AI Research | Hybrid | LLMs | RLHF
Luma AI

Simulation Researcher/Engineer

As a Simulation Researcher/Engineer, you will design and build simulation environments for training general-purpose robot policies. This role involves working with generative models and classical physics simulation, developing differentiable pipelines, and driving asset generation.

250k – 450k | Los Angeles, CA +2 | AI Research | Hybrid | C++ | PhysX
Luma AI

Research Scientist - World Model

As a Research Scientist on the World Models team, you will invent next-generation world model architectures with a focus on controllability and physical consistency, develop controllability mechanisms, and define and own metrics for physical fidelity and action-following.

250k – 450k | Los Angeles, CA +2 | AI Research | Hybrid | PyTorch | Robotics
Airbnb

Principal Engineer in Bayesian, Large Foundational Systems, and Distributional Reinforcement Learning

Lead advanced research and development of cutting-edge AI models with deep expertise in Bayesian Learning and Distributional Reinforcement Learning. This role involves architecting and integrating foundational Bayesian frameworks with advanced architectures and large language models to redefine personalization and decision-making.

296k – 370k | United States | AI Research | Remote | C++ | Java