Staff ML Engineer, AI Platform
Builds ML platform infrastructure, including evaluation and release gates, debug tooling, chart context retrieval, data pipelines, and model serving, to accelerate AI improvements for clinical workflows. Requires 7+ years of software engineering experience, including 3+ in ML infrastructure or platform work, and strong Python or TypeScript backend skills.
What You’ll Own
Eval & Release Infrastructure
- Automated graders and release gates that work across product pods
- Unified eval dataset versioning and execution to replace fragmented workflows
- Production quality monitoring with end-to-end tracing, shared metrics, and automated alerting
Debug Tooling
- Encounter replay that reconstructs exact inference inputs (retrieved chart context, packed prompts, model versions) so teams can reproduce issues without digging through logs
- Diff views comparing known-good runs to regressions
Chart Context & Data Pipelines
- The retrieval layer that pulls relevant patient history and assembles it into consistent model-ready inputs
- Feedback loops that capture real-world usage and convert it into training signal
- End-to-end latency instrumentation across every workflow step
Preference Infrastructure
- The system that enables clinician- and site-specific behavior across specialties
- A platform that supports customization at scale, since different clinics want different defaults, phrasing, and workflows
Model Serving
- Performance and reliability layer for critical in-house models with clear SLOs, capacity planning, and regression alerts
Who You Are
- 7+ years in software engineering, including 3+ focused on ML infrastructure, platform engineering, or data systems
- Staff-level scope: owned cross-cutting infrastructure, influenced technical direction across multiple teams
- Strong backend fundamentals in Python, TypeScript, or similar
- Built eval systems, data pipelines, or ML observability infrastructure
- Comfortable on both the ML and engineering sides of MLOps
- Track record of platform work that measurably accelerated other teams
- Based in SF, in person 3x/week
Compensation
The base compensation range is approximately $250,000 to $300,000 per year, exclusive of equity.