
Staff Applied Scientist - Agentic Interfaces

276k – 345k | New York, NY | Onsite | 10+ YOE
Summary

As a Staff Applied Scientist, you will define and build measurement systems for AI agent interfaces at Datadog, focusing on evaluation strategy, metric definition, and dataset creation to improve agent performance on customer workflows.

About the role

What You’ll Do:

  • Own the evaluation strategy for Datadog's AI agent integrations. Define the metrics — offline and online, quality and cost, single-turn and trajectory-level — that the team and the broader organization optimize against.
  • Build the eval datasets, golden traces, and regression harnesses that catch quality changes before they hit customers, and make those assets reusable by every team contributing tools to the platform.
  • Drive measurable improvements to retrieval relevance, tool-selection accuracy, and context efficiency, partnering closely with the AI engineers on the team who build the underlying platform.
  • Run applied research on the open problems in agent–data interaction: tool selection under large catalogs, multi-turn agent evaluation, grounding and hallucination control on live telemetry, cost/quality tradeoffs at scale.
  • Partner with the Bits SRE, Bits Assistant, and Bits Dev Agent teams so first-party agents benefit from the same measurement substrate as third-party integrations, and so learnings move freely in both directions.
  • Provide technical leadership across the Agentic Interfaces team and the broader organization through design reviews, working groups, and mentorship, and represent the team externally through talks, blog posts, and contributions to the open agent ecosystem.

Who You Are:

  • You have a BS/MS/PhD in a scientific field, or equivalent experience.
  • You have 10+ years of relevant engineering or applied science experience, including time as a technical lead.
  • You have a proven track record of leading ML or GenAI initiatives in a product-driven environment, from research through production.
  • You have significant experience with evaluation, experimentation, or measurement of ML systems at scale.
  • You bring a strong product mindset and are comfortable driving initiatives across cross-functional teams.
  • You thrive in ambiguity and can make sound technical calls when the path isn’t yet defined.

Benefits and Growth:

  • New hire stock equity (RSUs) and employee stock purchase plan (ESPP)
  • Continuous professional development, product training, and career pathing
  • An inclusive company culture, giving programs, and the ability to join our Community Guilds (Datadog employee resource groups)
  • Competitive global benefits and global Spring Health benefits for employees and dependents age 6+
  • Datadog offers a competitive salary and equity package, which may include variable compensation. Actual compensation is based on factors such as the candidate's skills, qualifications, and experience. In addition, Datadog offers a wide range of best-in-class, comprehensive, and inclusive employee benefits for this role, including healthcare, dental, parental planning, and mental health benefits, a 401(k) plan and match, paid time off, fitness reimbursements, and a discounted employee stock purchase plan.

The reasonably estimated yearly salary for this role at Datadog is $276,000 to $345,000 USD.

Skills
ML Systems, GenAI, Evaluation, Experimentation, Measurement, Applied Research, Technical Leadership, Product Mindset