Machine Learning Engineer

Builds product workflows and agentic systems using language models for research tasks like evidence synthesis and experiment planning. Combines ML fluency with strong software engineering to create reliable, trustworthy AI tools for scientific decision-making.

230k – 340k | Oakland, CA | ML Engineering | Hybrid

About the role

Responsibilities

Build agentic harnesses for target assessment, evidence synthesis, and experiment planning that allow models to provide guarantees about their processes.
Develop data integrations across literature, scientific databases, customer data, and internal tools.
Create APIs that customers can use in their own systems.
Build evaluation systems that help understand whether changes improve user outcomes.
Implement trust and transparency features, like source-quality signals, intermediate reasoning, and ways to inspect and fix outputs.

Example Projects

Build a target-assessment workflow that combines literature, genetics, chemistry, clinical, regulatory, and company data into a shareable artifact.
Build experiment-planning and iteration tools that help researchers decide what to do next and learn from new results.
Build evidence-monitoring workflows that keep teams up to date through alerts, briefs, and living reports.
Build enterprise APIs and structured-output pipelines that plug Elicit into customers' internal systems.
Build interfaces that make it easier to inspect, trust, and correct model outputs.
Build workflow-specific evals and quality systems that tell us whether a product change actually helped users.
Improve extraction, reasoning, or search quality with better prompts, better system design, or finetuning when appropriate.

Requirements

Strong software engineering background and ability to build end-to-end systems, not just scripts or notebooks.
Enough fluency with language models to reason well about prompting, retrieval, evals, failure modes, and where (and how) finetuning is or isn't worth it.
Strong product sense and ability to turn fuzzy user problems into concrete things people can use.
Excitement about solving difficult, creative problems rather than narrowly optimizing for well-defined benchmarks.
Ability to move across backend, data, and model layers as needed.
Clear communication with product, design, domain experts, and other engineers.
Ability to use coding assistants effectively and thoughtfully.

Compensation

Career (L3): $185-230K + equity
Senior (L4): $230-260K + equity
Expert/Staff (L5): $255-340K + significant equity
Targeting senior-level (L4) or above.

Skills

Python, Language Models, Prompting, Retrieval, Evaluations, Finetuning, APIs, Backend Engineering, Data Integration, LLMs
