AI Engineer (Core)

180k – 250k | San Francisco, CA | ML Engineering | Onsite | May 7

Summary

Builds core infrastructure for production AI agents, including runtime, evaluation systems, retrieval, tool orchestration, observability, and reliability features for high-stakes real estate workflows. Requires strong systems engineering skills along with Python, backend, and LLM experience.

About the role

What you’ll do

Build the core agent platform used by product engineers to create, run, evaluate, debug, and deploy AI workflows.
Design infrastructure for long-running agents, tool orchestration, workflow state, retries, fallbacks, human handoff, and resumability.
Build context and retrieval systems that help agents use the right documents, structured data, prior decisions, project state, and tool outputs.
Create eval infrastructure for agent behavior, document understanding, groundedness, workflow completion, visual reasoning, latency, cost, and regressions.
Build observability systems for traces, prompts, model versions, tool calls, intermediate reasoning artifacts, failure modes, human overrides, and production quality metrics.
Improve the reliability of LLM-powered systems through deterministic checks, structured outputs, validation layers, guardrails, monitoring, and failure recovery.
Partner with product engineers to turn repeated workflow patterns into reusable primitives, SDKs, templates, and platform capabilities.
Evaluate and integrate models, agent frameworks, retrieval techniques, multimodal capabilities, and AI infrastructure tools.
Own performance, scalability, security, and maintainability across the AI platform.
Help define the engineering standards for production agent systems at Build.

What success looks like

In your first few months, you will improve the reliability and velocity of Build’s agent platform in ways product engineers can feel. That may mean a stronger eval harness, better traces, safer tool execution, more reliable context assembly, lower latency, lower cost, or fewer production regressions.

Over the longer term, your work will make it possible for Build to ship increasingly autonomous workflows while preserving trust, auditability, security, and operational control.

You may be a fit if

You are a strong systems engineer who wants to build infrastructure for production AI agents.
You have deep experience with backend systems, distributed systems, data systems, workflow engines, observability, or developer platforms.
You are fluent in Python and comfortable designing reliable APIs, services, queues, workers, storage models, and execution systems.
You have built with LLM APIs, tool calling, structured outputs, RAG, evals, tracing, or agent frameworks.
You care about reliability, debuggability, latency, cost, safety, and maintainability.
You think in interfaces, abstractions, failure modes, and long-term platform leverage.
You can separate what should be product-specific from what should become a reusable platform primitive.
You move fast, but you care about the engineering discipline needed to make fast teams safe.

Bonus points

Experience with agentic frameworks, LLMs, workflow engines, vector databases, reranking, model gateways, or AI observability tools.
Experience building eval systems, trace replay systems, regression infrastructure, prompt/model versioning, or LLM quality dashboards.
Experience with document AI, multimodal systems, structured extraction, citation systems, or knowledge graph infrastructure.
Experience designing permission systems, sandboxed tools, policy engines, secure execution layers, or audit trails for AI systems.
Experience supporting product teams through internal SDKs, frameworks, platform abstractions, or developer tooling.

Skills

Python, LLM APIs, Tool Calling, Structured Outputs, RAG, Agent Frameworks, Vector Databases, Eval Systems, Observability, Distributed Systems
