
Lead AI Engineer

Leads development of TRAM, a proprietary AI reasoning model for interpreting global trade law. The work spans building data pipelines, fine-tuning LLMs, and developing evaluation frameworks for fast, accurate compliance determinations. Requires experience building AI products, especially RAG systems, and fine-tuning models.

250k – 280k · San Francisco, CA · ML Engineering · Onsite

About the role

What You'll Do

Within weeks:

  • Lead development of new features aimed at increasing TRAM’s test-time accuracy
  • Work on the underlying data and retrieval pipelines that help power our AI workflows
  • Work directly with our internal tax experts to understand how TRAM can better reason like them

Within months:

  • Own TRAM’s eval framework and workflows
  • Work directly with leading frontier labs to reinforcement fine-tune models on our proprietary data

Requirements

  • Prior experience building AI-enabled products, particularly RAG systems
  • Experience fine-tuning base models, ideally via reinforcement fine-tuning (RF)
  • Willingness to dive into technical tax problems
  • A strong understanding of how LLMs and reasoning models function

Nice to Haves

  • Experience working with LLMs on legal applications
  • Experience with RAG data pipelines, including collecting and curating the data that feeds them

Skills

RAG, LLMs, Fine-Tuning, Retrieval Pipelines, Evaluation Frameworks, Data Pipelines, Reasoning Models
