Senior Machine Learning Operations Engineer
Build and operate Mercury's real-time ML inference platform for fraud risk decisioning. Own model deployment, observability, and lifecycle tooling, drawing on strong backend Python fundamentals.
Responsibilities
- Build and operate the real-time inference service that serves model scores to the risk decision engine, with low latency and high availability as first-class requirements
- Own model deployment infrastructure — registry and versioning, CI/CD with performance, bias, and consistency checks, shadow mode, and staged rollouts
- Build model observability: availability, latency, and error monitoring, plus drift detection as a retraining trigger
- Partner with Risk Data Science to carry models through a clean development-to-production handoff and into production operation under MLP ownership
- Implement experimentation capabilities such as champion/challenger and canary routing, and explainability outputs like SHAP attributions
- Take strong product ownership and actively seek responsibility: self-organize on small and medium projects, and help shape and build a brand-new platform team
Requirements
- 5+ years in machine learning engineering, backend software engineering, MLOps, or a closely related field
- Production ML service experience — deploying, serving, and operating models in low-latency, high-availability contexts
- Strong backend engineering fundamentals in Python, with API frameworks like FastAPI or Flask
- Experience with model deployment and lifecycle tooling: model registries, CI/CD for models, versioning, and staged rollout patterns (shadow, canary, champion/challenger)
- Experience building observability and alerting for production services — latency, errors, and ideally model-specific signals like drift
- Comfort with the data layer ML depends on: SQL, key-value/low-latency stores (Redis, DynamoDB, or equivalent), and streaming pipelines (Kafka, Kinesis, Redpanda, or equivalent)
Nice to Have
- Familiarity with a modern data stack (Snowflake, dbt, Dagster, Airflow, or similar)
- Experience operating in a regulated, audit-sensitive, or compliance-adjacent environment
- Exposure to functional languages or willingness to work across a stack that includes Haskell, React, and TypeScript
AI Engineer, Evaluation
Design and implement evaluation frameworks and pipelines for AI systems using Evaluation-Driven Development. Build Python-based test suites, LLM graders, and measurement systems that guide prompt iteration and production deployment decisions.
Senior AI Engineer
Senior Engineer building multi-agent AI systems, LLM integrations, and backend automation services that power Marketing Operations. Owns technical direction for agentic infrastructure connecting models to business systems.
Senior Machine Learning Engineer
Build and deploy cutting-edge Agentic AI and LLM systems to transform Airbnb's customer service experience, including Chat and Voice AI assistants. Requires 6+ years experience with production ML/AI systems at scale.