Research Scientist, Agent Robustness

This Research Scientist role focuses on agent robustness: developing tests, exploits, and mitigations for safe AI agents. It requires 3+ years of ML experience, familiarity with post-training and RL techniques such as RLHF and DPO, and published research in generative AI.

197k – 247k | San Francisco, CA | New York, NY | AI Research | Hybrid | 3+ YOE

About the role

Responsibilities

  • Research the science of AI agent capabilities with a focus on safety, risk factors, and benchmarking methodologies.
  • Design and build harnesses to test AI agents’ tendency to take harmful actions when pressured or tricked.
  • Design and build exploits and mitigations for failure modes arising from agent affordances like coding, web browsing, and computer use.
  • Characterize and design mitigations for failure modes or risks in systems with multiple interacting AI agents.

Requirements

  • Commitment to promoting safe, secure, and trustworthy AI deployments.
  • Practical experience conducting technical research collaboratively, including building agent scaffolding, designing evaluation harnesses, and prototyping research ideas.
  • Experience with post-training and RL techniques such as RLHF, DPO, and GRPO.
  • Track record of published research in machine learning, particularly generative AI.
  • At least 3 years of experience addressing sophisticated ML problems.
  • Strong written and verbal communication skills.

Nice to Have

  • Hands-on experience with agent evaluation frameworks such as SWE-bench, WebArena, OSWorld, and Inspect.
  • Experience with red-teaming, prompt injection, or adversarial testing of AI systems.

Skills

RLHF, DPO, GRPO, SWE-bench, WebArena, OSWorld, Inspect, Red-Teaming, Prompt Injection, Adversarial Testing, Generative AI, Machine Learning, Agent Evaluation, RL Techniques
