Multimodal AI Engineer, Document Understanding

180k – 250k | San Francisco, CA | ML Engineering | Hybrid | 3+ YOE | Nov 21

Summary

Develops and optimizes ML models for document understanding, with a focus on computer vision, NLP, and multimodal processing for parsing complex documents such as PDFs and spreadsheets at scale. Requires 3-7 years of ML engineering experience, including production Python and model training.

About the role

Responsibilities

Develop, train, and optimize machine learning models for document structure understanding, table extraction, layout analysis, and multimodal content processing
Build robust data pipelines, evaluation frameworks, and experimentation infrastructure
Design and implement production ML systems that handle complex, real-world documents at scale
Stay current with the latest advances in vision-language models, document AI, and multimodal learning
Collaborate with engineering teams to integrate ML innovations into production APIs
Contribute to both our open-source frameworks and enterprise offerings
Drive technical decisions while balancing research exploration with product delivery

Required Qualifications

3-7 years of experience in machine learning engineering or applied research
Strong software engineering fundamentals with production Python experience (modern tooling: uv, ruff, mypy, Pydantic)
Hands-on experience training, fine-tuning, or deploying ML models in production
Deep understanding of modern ML techniques, particularly in computer vision, NLP, or multimodal learning
Experience with at least one of: data pipeline development, model training/fine-tuning, or ML infrastructure
Ability to read and implement from research papers and technical specifications
Track record of executing with high intensity in fast-paced environments
Strong technical communication skills and comfort with open-source collaboration

Preferred Qualifications

Experience with vision-language models, transformer architectures, or model fine-tuning (LoRA, QLoRA)
Experience building evaluation frameworks, benchmarks, or data quality pipelines
Experience with model serving frameworks (vLLM, TensorRT, ONNX) or MLOps tools
Experience specifically with document understanding, OCR, or layout analysis
Contributions to open-source ML projects or frameworks
Experience with LLM applications and RAG systems
Strong understanding of model optimization techniques (quantization, distillation, pruning)
Experience with Docker/Kubernetes and distributed systems
Active participation in the ML research community

Skills

Python, Machine Learning, Computer Vision, NLP, Multimodal Learning, Vision-Language Models, Transformers, LoRA, QLoRA, Pydantic, Docker, Kubernetes, vLLM, TensorRT, ONNX

Similar roles at this salary range

Zoox

Jun 24

Machine Learning Engineer - Simulation Framework

Machine Learning Engineer focused on GPU-based simulation frameworks, reinforcement learning, and bridging sim-to-real gaps for autonomous vehicle safety validation. Requires MS/PhD and strong C++/Python experience.

151k – 257k | Foster City, CA +1 | ML Engineering | Hybrid | 7+ YOE | JAX | C++

Talkiatry

Jun 24

Senior AI Engineer

Build full-stack AI systems including agentic workflows, RAG pipelines, and production infrastructure for mental healthcare applications. Requires 2+ years software engineering experience and 1+ year with LLMs or agentic AI.

170k – 195k | United States | ML Engineering | Remote | 2+ YOE | RAG | React

Grafana Labs

Jun 24

Staff AI Engineer

Staff AI Engineer building and shipping LLM/agent-powered observability features for incident detection, triage, and resolution. Requires strong production software engineering experience plus practical GenAI/LLM application skills.

175k – 220k | United States | ML Engineering | Remote | 7+ YOE | AWS | GCP

Airbnb

Jun 23

Staff Machine Learning Engineer

Build and deploy cutting-edge ML and Generative AI systems to transform Airbnb's customer support experience, focusing on LLM fine-tuning, RAG, and intelligent service automation.

212k – 260k | San Francisco, CA | ML Engineering | Remote | 9+ YOE | LLM | RAG

Jun 23

Staff Software Engineer, Trends Machine Learning Infrastructure

Lead technical direction for Pinterest's unified AI-powered Trends and Audience Insights platform. Architect scalable ML data pipelines and LLM capabilities while mentoring engineers and driving cross-team integrations.

177k – 365k | San Francisco, CA | ML Engineering | Hybrid | 8+ YOE | LLMs | Codex
