
Software Engineer

Build simulation environments, tasks, and verifiers to train and evaluate long-horizon autonomous AI agents using reinforcement learning. Requires strong engineering fundamentals, high agency, and a focus on robust systems.

San Francisco, CA | ML Engineering | Remote

About the role

Responsibilities

  • Build diverse, high-fidelity environments that test agents in realistic settings
  • Design complex tasks that require long-horizon reasoning and tool use
  • Develop robust verifiers that reliably measure agent performance
  • Improve the infrastructure and tooling used to run, debug, and refine environments
  • Work closely with the research team to identify failure modes and turn them into new tasks and benchmarks

Requirements

  • Strong engineering fundamentals
  • Enjoy building from first principles and solving open-ended technical problems
  • High agency and a strong bias toward shipping
  • A high quality bar and genuine care for building robust systems

Skills

Reinforcement Learning, Simulation Environments, Python, Infrastructure, Tooling, Debugging, Verifiers, Benchmarks
