Senior Machine Learning Engineer, Personalization, Magenta

Designs and ships production ML systems for conversational AI personalization at Spotify, focusing on user intent interpretation, agentic workflows, context management, and scalable evaluation frameworks. Requires 5+ years experience with LLMs and production ML.

New York, NY | ML Engineering | Remote | 5+ YOE

About the role

What You'll Do

Design and ship production-grade machine learning systems powering conversational and agentic AI experiences
Build systems that interpret user intent, manage context across multi-turn interactions, and handle ambiguity reliably at scale
Develop and evolve agentic workflows including memory, context management, and multi-step tool orchestration
Create evaluation frameworks, including LLM-as-judge pipelines, to measure quality and guide iteration
Partner closely with product, engineering, and design to deliver seamless, user-facing experiences
Balance experimentation with production rigor, ensuring performance, latency, and reliability at Spotify scale
Continuously improve agent behavior through tight feedback loops between evaluation and real-world usage

Who You Are

5+ years of experience building and shipping machine learning systems in production environments
Experienced with large language models, with real-world applications beyond experimentation; have shipped and maintained large-scale LLM systems
Deep understanding of challenges in conversational or agentic systems, such as context handling and multi-step reasoning
Know how to evaluate ML systems rigorously and have experience designing metrics or evaluation pipelines
Comfortable debugging complex interactions between models, tools, and system constraints like latency
Care about building reliable, scalable systems that deliver high-quality user experiences
Enjoy working cross-functionally and contributing to a collaborative, inclusive team environment

Skills

Machine Learning, LLMs, Conversational AI, Agentic AI, LLM-as-Judge, Context Management, Multi-Step Reasoning, Evaluation Frameworks, Production ML Systems

Similar roles

Traba

Senior Software Engineer

Build and own production AI agent systems (harnesses, evals, orchestration) on frontier LLMs for industrial supply chain workflows. Requires 5+ years software engineering with 1+ year shipping LLM/agent features, strong Python/TS, and high-agency customer immersion.

200k – 240k | New York, NY +1 | ML Engineering | On-site | 5+ YOE | WMS | TMS

Otter

Senior Machine Learning Engineer

Lead projects building and deploying large-scale ASR, NLP, and LLM systems for meeting intelligence. Requires 5+ years building production ML systems with PyTorch/JAX and experience with speech/language models.

230k – 265k | Mountain View, CA | ML Engineering | Hybrid | 5+ YOE | JAX | ASR

Dialpad

Sr AI Engineer - Agentic Systems

Technical leader building and scaling production-grade multi-agent AI systems for real-time voice, workflow automation, and enterprise tool execution. Requires 8+ years experience and deep expertise in LLM platforms, agent frameworks, and distributed systems.

United States | ML Engineering | Remote | 8+ YOE | CrewAI | Voice AI

OpenAI

Agent Post-Training, Artifacts Research

Train frontier models to generate polished artifacts (docs, spreadsheets, slides) by owning post-training improvements across RL, data, evals, and alignment. Requires strong ML fundamentals and hands-on LLM/RL experience.

295k – 445k | San Francisco, CA | ML Engineering | On-site | 7+ YOE | LLMs | RLHF

OpenAI

Agent Post-Training, Computer Use Research

Train frontier models to operate computers, browsers, and desktops. Design experiments, build evals, own post-training pipelines (RL, data, graders), and ship improvements into OpenAI agents.

295k – 445k | San Francisco, CA | ML Engineering | On-site | 7+ YOE | RLHF | LLMs