Senior Applied Research Engineer
167k – 226k | San Francisco, CA | ML Engineering | Hybrid | 5+ YOE
Summary
Senior Applied Research Engineer driving AI system quality through experimentation and evaluation of RAG, retrieval, and reasoning systems. Requires 5+ years of applied ML/NLP experience, strong Python skills, and a solid grounding in evaluation methodology.
About the role
Responsibilities
- Design and evaluate information access + reasoning strategies across RAG, agents, and classic ML: chunking, embedding models, hybrid search, metadata filtering, semantic routing
- Prototype GenAI workflows (including agentic systems) that map and reason over compliance objects (controls ↔ risks ↔ requirements ↔ evidence)
- Explore ML + probabilistic approaches where GenAI is not the best fit: classifiers, ranking models, graph/link prediction, calibration, and structured prediction
- Build and maintain evaluation frameworks: golden datasets, automated quality metrics, regression detection
- Implement and tune ranking/reranking systems: cross-encoders, LLM-based rerankers, learning-to-rank, custom scoring functions
- Run experiments to validate hypotheses and quantify improvements before production rollout
- Debug failure modes and build error taxonomies across retrieval, reasoning, and generation
- Collaborate with AI and Software Engineers to hand off validated approaches for productionization
- Stay current on applied research in RAG, agents, LLM evaluation, and relevance modeling; bring innovations into the product
Requirements
- 5+ years of experience in applied research, data science, or ML with a focus on NLP, information retrieval, or knowledge systems
- 2+ years of hands-on experience building or contributing to production AI/ML systems
- Strong foundation in information retrieval: dense and sparse retrieval, embedding models, search relevance
- Experience with RAG systems: chunking strategies, vector databases, retrieval optimization
- Proficiency in evaluation methodology: metrics design, golden dataset creation, A/B testing, statistical significance
- Strong Python skills and comfort with notebook-driven research workflows
- Experience communicating research findings to engineering teams and translating insights into actionable improvements
Nice-to-Haves
- Experience with compliance, legal, or document-heavy domains
- Publications or contributions in IR, NLP, or RAG evaluation
Compensation & Benefits
- Competitive base salary: $166,900 - $225,900
- Stock equity (RSUs)
- Up to 100% employer-paid medical, dental, and vision premiums
- 401(k) plan, company-paid life and disability insurance
- Paid Parental Leave (after 6 months)
- Kindbody fertility and family-building benefits
- Generous annual professional and personal development stipends
- Flexible vacation policy and paid holidays
Skills
Python, RAG, Information Retrieval, NLP, Machine Learning, Vector Databases, Evaluation Metrics, A/B Testing, Statistical Analysis, Embedding Models