Technical Lead, Evaluation Infrastructure
Lead the Evaluation Infrastructure team building metrics, evaluation pipelines, and validation platforms for autonomous vehicle safety and iteration. Requires 4+ years of experience in distributed systems and ML evaluation, strong Python/C++ skills, and fluency with AI-native engineering practices.
Responsibilities
- Build and own a unified metrics, evaluation, and validation platform — pipelines, introspection tooling, and analysis products that turn on-road and simulation logs into high-fidelity signals for autonomy iteration and driverless safety validation
- Drive the technical bar for metric quality across both heuristic and ML-based approaches
- Invest in the scale, reliability, and CI/CD of the evaluation stack to shorten time-to-signal for evaluation and time-to-confidence for validation, and to meet high SLAs for downstream stakeholders
- Mentor and grow the Evaluation Infrastructure team, and champion AI-native engineering practices that compound team velocity and code quality
- Partner with Product, Autonomy, Systems & Safety, and Simulation teams to define and execute the vision and strategy for evaluation
Requirements
- B.Sc. or M.Sc. degree plus 4+ years of relevant work experience
- Strong fluency in distributed systems, large-scale data and ML evaluation pipelines, metrics frameworks (heuristic and/or ML-based), and analytics platforms
- Experience setting technical vision, roadmap, and prioritization for a team operating at the intersection of autonomy, safety, and data infrastructure
- Clear, concise communicator who partners effectively with PMs, engineers, and cross-functional stakeholders
- Ability and willingness to deep-dive into implementation
- Ability to set the technical bar for metric quality, pipeline rigor, and safety-critical engineering practice
- Strong proficiency in Python, C++, or similar languages
- Daily user of modern AI coding assistants and agentic tools (Claude Code, Cursor, and similar), with strong intuition for where they accelerate engineering work
Nice-to-Haves
- Knowledge of data engineering tooling and best practices
- Knowledge of batch and streaming data processing, warehousing, and analytics solutions
- Experience with data workflow orchestration platforms
- Prior experience building evaluation, validation, or analytics platforms, ideally in autonomy, robotics, or safety-critical systems
Compensation & Benefits
- Base pay range: $193,930 - $291,150/year
- Annual performance bonus and equity
- Competitive benefits package