# Machine Learning Engineer, LLM Evals & Observability
**Company:** [Glean](https://hotfix.jobs/companies/glean)
**Location:** Mountain View, CA
**Salary:** $200K-$300K
**Experience:** 2+ years
**Skills:** Python, Go, LLM Evaluation, Natural Language Processing, Distributed Data Pipelines, Evaluation Pipelines, LLM-Powered Judges, Observability Infrastructure, Machine Learning
**Posted:** 2026-05-12
> Builds evaluation pipelines, LLM judges, and observability tools to measure and improve AI assistant quality. Requires 2+ years of software engineering experience in Go and Python, LLM evaluation experience, and the analytical rigor to work on backend ML infrastructure.
## Job Description
## Responsibilities
- Design and curate evaluation datasets – sampling strategies, query diversity, and golden sets that give reliable, representative coverage of real assistant behavior.
- Build and maintain large-scale evaluation pipelines that measure assistant quality across thousands of real user queries.
- Build LLM-powered judges that score metrics like correctness, completeness, and response quality, and align them against human judgment.
- Evaluate new models and product changes before they ship – providing the quality signal that gates launches and prevents regressions.
- Build observability infrastructure for AI agents: trace enrichment, data pipelines, and dashboards that make assistant behavior inspectable.
- Close the loop between quality measurement and improvement, using eval results, customer feedback, and techniques like automated prompt iteration to drive concrete gains in assistant behavior.
- Collaborate with engineers across the company to make evals a first-class part of how we ship.

## Requirements
- 2+ years of software engineering experience with strong coding skills.
- Strong backend fundamentals in **Go** and **Python**; comfortable with distributed data pipelines.
- Experience working with **LLM evaluation**, **reinforcement learning from human feedback**, **natural language processing**, or other large systems involving machine learning.
- Analytically rigorous – you think carefully about what offline metrics actually predict about real user experience.
- You thrive in a customer-focused, tight-knit, cross-functional environment – you are a team player, willing to take on whatever is most impactful for the company.
- You care about quality – not just in the systems you build, but in the product you're helping measure and improve.

## Compensation & Benefits
- Base salary range: **$200,000 - $300,000** annually.
- Variable compensation, equity, and benefits eligibility.
- Comprehensive benefits: medical, vision, dental, generous time off, 401(k), home office stipend, education and wellness stipends, company events, and daily lunches.

**Apply:** https://hotfix.jobs/jobs/machine-learning-engineer-llm-evals-observability-at-glean-333f2e4d-dedc-4a3b-9245-370c61f06447
**Canonical:** https://hotfix.jobs/jobs/machine-learning-engineer-llm-evals-observability-at-glean-333f2e4d-dedc-4a3b-9245-370c61f06447