Research Engineer, Agents

Research Engineers build and productionize agentic AI systems, designing compound architectures, evaluation frameworks, and reliable execution for enterprise workflows. Requires strong Python skills, systems reasoning, and experience building agents with tools, retrieval, planning, and memory.

150k – 250k · San Francisco, CA · New York, NY · ML Engineering · Hybrid

About the role

Key Responsibilities

  • Design, prototype, and implement agentic AI systems that perform reliably across complex enterprise workflows
  • Build compound AI architectures that combine planning, tool use, retrieval, memory, evaluation, orchestration, and execution
  • Investigate how agents reason, coordinate, recover from errors, and interact with external systems under real-world constraints
  • Develop evaluation frameworks that measure agent behavior, task completion, reliability, robustness, and failure modes
  • Create tools and abstractions that make agent behavior easier to observe, debug, test, and improve
  • Partner with AI Researchers to explore new agent architectures and with AI Engineers to harden successful approaches for production use
  • Integrate agents into customer APIs, applications, data platforms, and operational workflows
  • Communicate clearly with internal teams and customer stakeholders about agent capabilities, limitations, tradeoffs, and risks

Requirements

  • Experience building agentic systems using models, tools, retrieval, planning, memory, or multi-step execution to complete real tasks
  • Strong engineering fundamentals with clean, maintainable Python and comfort debugging complex, stateful systems
  • Systems-level reasoning about how prompts, tools, context, evaluators, state, orchestration, and external APIs interact
  • Research-oriented builder mindset: curious about why agents succeed or fail and able to design experiments to test architectures
  • AI-native working style: uses AI tools daily to write code, debug systems, explore designs, analyze traces, and accelerate experimentation
  • Bias toward showing over telling, with a preference for working demonstrations, traces, evaluations, and production behavior
  • Comfort in customer environments, translating ambiguous business workflows into concrete agent designs and explaining system behavior to stakeholders
  • Ownership mentality: takes responsibility for whether an agentic system performs reliably, safely, and usefully in production

Compensation & Benefits

  • Base salary range: $150K – $250K depending on experience, location, and level
  • Meaningful equity
  • 100% covered medical, dental, and vision for employees and dependents
  • 401(k) with additional perks (e.g., commuter benefits, in-office lunch)
  • Access to state-of-the-art models and generous usage of modern AI tools

Skills

Python, Agentic AI Systems, Tool Use, Retrieval, Planning, Memory, Evaluation Frameworks, Orchestration, Compound AI Architectures, Debugging

Similar roles

ML Engineering jobs

AI Engineer

Build full-stack AI prototypes and agentic systems to pressure-test venture ideas. Requires 3+ years building production AI applications with strong frontend/backend fluency and frontier coding agent expertise.

150k – 190k · Mountain View, CA · ML Engineering · On-site · 3+ YOE · SQL · APIs

Machine Learning Engineer

Build and deploy ML models for entity resolution and knowledge graph expansion on large-scale China-related data. Requires 4+ years clustering ML experience and end-to-end production ML with Python/SQL.

150k – 195k · New York, NY · ML Engineering · Hybrid · 4+ YOE · SQL · NLP

Research Engineer, Post-Training

Research Engineers design and run post-training workflows, build evaluation infrastructure, and turn frontier AI techniques into reliable production systems for enterprise customers. Requires experience with fine-tuning, RLHF, reward modeling, and strong experimentation skills.

150k – 250k · San Francisco, CA +1 · ML Engineering · Hybrid · Evals · Python

ML Engineer

Build and deploy production ML models and pipelines to detect suspicious activity, improve verification accuracy, and support threat intelligence workflows.

150k – 180k · United States · ML Engineering · Remote · 4+ YOE · AWS · Clustering

Algorithm Engineer

Lead biosignal algorithm development from requirements to production for medical devices, leveraging ML/DL, DSP, and statistics. Requires 4+ years industry experience bringing algorithms into production.

150k – 170k · United States · ML Engineering · Remote · 4+ YOE · CI/CD · Docker