# Technical Data Delivery Lead
**Company:** [Pareto AI](https://hotfix.jobs/companies/pareto-ai)
**Location:** Remote
**Salary:** $140K-$180K
**Skills:** Python, SQL, LangChain, DSPy, AutoGen, RLHF, SFT, RLVR, Red-Teaming, LLMs, Agentic Workflows
**Posted:** 2026-03-16
> Leads architecture, execution, and improvement of data collection/evaluation pipelines for AI labs, including agentic automation for quality and delivery. Requires Python/SQL proficiency, LLM internals knowledge, and hands-on agent framework experience.
## Responsibilities

**Pipeline architecture**
- Design end-to-end data collection and evaluation pipelines for RLVR, RLHF, SFT, red-teaming, and model evaluation workflows.
- Prototype novel workflows quickly, identify architectural risks, and make tradeoff decisions.
- Understand agent-tool interactions and communicate engineering needs.

**Agentic system deployment**
- Build, test, and iterate on AI agents for automating pipeline tasks like quality gate review, expert matching, output flagging, and anomaly detection.
- Scope agent capabilities, write prompts and evaluation logic, and monitor production performance.

**Quality systems**
- Define data quality standards for annotation, evaluation, and expert output review.
- Design and run audits using inter-rater reliability metrics, calibration sets, and statistical sampling.
- Build preventive systems: automated checks, structured output validation, model-assisted review.
- Spot-check tasks and translate findings into expert guidelines.
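
As an illustration of the inter-rater reliability audits mentioned above, here is a minimal sketch of Cohen's kappa for two annotators. The posting does not say which metric is used, so the choice of kappa, the function name `cohens_kappa`, and the sample labels are assumptions made for the example.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labeling the same items (illustrative sketch)."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: fraction of items where both raters chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies, summed over labels.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used one identical label for every item; treat as full agreement.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical audit sample: two annotators rating the same four tasks.
rater_1 = ["good", "good", "bad", "bad"]
rater_2 = ["good", "bad", "bad", "bad"]
kappa = cohens_kappa(rater_1, rater_2)
print(f"kappa = {kappa:.2f}")
```

In this sample the raters agree on three of four items (0.75 observed) against 0.5 expected by chance, which gives a kappa of 0.5. A low value on a calibration set would usually point to unclear guidelines rather than a single careless annotator.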

**Client interface**
- Engage with AI researchers, TPMs, and PMs to translate requirements into workflows.
- Communicate pipeline performance, escalate risks, and contribute to scoping and pricing.

**Research integration**
- Stay current with LLM post-training, evaluation methodology, and data tooling.
- Evaluate and integrate new approaches like model-assisted annotation.

## Requirements

- Proficiency in **Python** and **SQL** for data manipulation, pipeline monitoring, and quality analysis.
- Working knowledge of LLM internals: RLHF/SFT training loops, the effects of prompt structure, and RL environment setup for agentic data collection and evaluation.
- Hands-on experience with at least one agentic or LLM workflow framework (**LangChain**, **DSPy**, **AutoGen**, direct tool-use via API, or equivalent).
- Demonstrated ownership of a data or ML pipeline from scoping through delivery, including quality design.
- Strong written communication for technical guidelines, rubrics, and researcher briefings.
- Comfort operating with ambiguity in fast-moving environments.

## Nice-to-haves

- Direct experience with RL environment data pipelines, evaluation framework design, and red-teaming workflows.
- Background in data engineering, ML research support, or equivalent.
- Experience designing or operating agentic systems in production or near-production settings.
- Familiarity with inter-rater reliability methods, calibration set design, and annotation quality frameworks.
- Prior client-facing or technical program management experience in an AI/ML-adjacent context.
- Experience scoping and driving projects with fuzzy specs.
**Apply:** https://hotfix.jobs/jobs/technical-data-delivery-lead-at-pareto-ai-0262b607-ec6d-4f86-92f6-09b4cb07da58