Multimodal AI Engineer, Document Understanding

180k – 250k · San Francisco, CA · ML Engineering · Hybrid · 3+ YOE
Summary

Develops and optimizes ML models for document understanding, focusing on computer vision, NLP, and multimodal processing to parse complex documents such as PDFs and spreadsheets at scale. Requires 3-7 years of ML engineering experience, including production Python and model training.

About the role

Responsibilities

  • Develop, train, and optimize machine learning models for document structure understanding, table extraction, layout analysis, and multimodal content processing
  • Build robust data pipelines, evaluation frameworks, and experimentation infrastructure
  • Design and implement production ML systems that handle complex, real-world documents at scale
  • Stay current with the latest advances in vision-language models, document AI, and multimodal learning
  • Collaborate with engineering teams to integrate ML innovations into production APIs
  • Contribute to both our open-source frameworks and enterprise offerings
  • Drive technical decisions while balancing research exploration with product delivery

Required Qualifications

  • 3-7 years of experience in machine learning engineering or applied research
  • Strong software engineering fundamentals with production Python experience (modern tooling: uv, ruff, mypy, Pydantic)
  • Hands-on experience training, fine-tuning, or deploying ML models in production
  • Deep understanding of modern ML techniques, particularly in computer vision, NLP, or multimodal learning
  • Experience with at least one of: data pipeline development, model training/fine-tuning, or ML infrastructure
  • Ability to read and implement from research papers and technical specifications
  • Track record of executing with high intensity in fast-paced environments
  • Strong technical communication skills and comfort with open-source collaboration

Preferred Qualifications

  • Experience with vision-language models, transformer architectures, or model fine-tuning (LoRA, QLoRA)
  • Experience building evaluation frameworks, benchmarks, or data quality pipelines
  • Experience with model serving frameworks (vLLM, TensorRT, ONNX) or MLOps tools
  • Experience specifically with document understanding, OCR, or layout analysis
  • Contributions to open-source ML projects or frameworks
  • Experience with LLM applications and RAG systems
  • Strong understanding of model optimization techniques (quantization, distillation, pruning)
  • Experience with Docker/Kubernetes and distributed systems
  • Active participation in the ML research community
Skills
Python, Machine Learning, Computer Vision, NLP, Multimodal Learning, Vision-Language Models, Transformers, LoRA, QLoRA, Pydantic, Docker, Kubernetes, vLLM, TensorRT, ONNX