Senior AI Engineer
188k – 242k | New York, NY | Hybrid | 6+ YOE
Summary
Senior AI Engineer building production LLM tools and pipelines, designing evals, and instrumenting observability. Requires 6+ years of software engineering experience and 2+ years of hands-on LLM production experience.
About the role
What You Will Work On
- Drop into business functions to identify where AI has the highest leverage, define what to build, and own the full arc from scoping through handoff
- Spot and resolve AI-specific systems gaps across the org (missing shared infrastructure, fragmented adoption patterns, absent measurement benchmarks) and move on them faster than the rest of the org could
- Scan externally for emerging AI capabilities and market patterns; translate signal into written briefs that inform what the team experiments on next
- Build AI-powered tools and pipelines using frontier models, agent frameworks, and LLM infrastructure as core building blocks, not experiments
- Design and implement evaluation frameworks (evals) to measure whether AI systems are performing as intended — not just technically functional but meaningfully useful
- Instrument AI systems for observability: logging, tracing, cost tracking, latency monitoring, and anomaly detection using tools like Datadog, LangSmith, or equivalent
- Navigate cross-functional ownership dynamics: identify the right stakeholders, build alignment, and hand off in a way that sticks
- Represent AI Frontiers in cross-functional forums and executive reviews as a credible voice on what AI can and can't do
Qualifications
- 6+ years of professional hands-on software engineering experience
- Full-stack or backend engineering depth: you have shipped and maintained production systems, not just prototypes
- 2+ years of hands-on experience building with LLMs and AI tooling in production: agent frameworks, pipelines, retrieval systems, or AI-integrated workflows that real users depend on
- Experience designing and running evals: you know how to define what "working" means for an AI system and build the measurement scaffolding to prove it
- Familiarity with LLM observability tooling (LangSmith, Datadog LLM monitoring, or equivalent): you instrument what you build and use signal to improve it
- Track record of working across functions: partnered with non-engineering stakeholders, understood their problem domain, built something that fit
- Strong written and verbal communication with non-technical audiences — you can explain a technical tradeoff to an ops lead without losing them
- Comfort with ambiguity: you scope your own work, write your own briefs, and don't need a detailed spec to get started
Nice to Have
- Practical machine learning experience: familiarity with model behavior, data quality, iteration loops, and how those fundamentals translate into building reliable AI systems
Compensation
- The base wage range for this position, based in our New York City office, is targeted at $188,000.00 - $242,000.00 per year.
Skills
Python, LLMs, Agent Frameworks, LangSmith, Datadog, Evaluation Frameworks, Observability, Backend Engineering, Full-Stack Engineering, Retrieval Systems