Staff Machine Learning Systems Engineer, Embeddings Platform

253k – 355k | United States | Remote | 8+ YOE | Jun 11

Summary

Staff ML Systems Engineer leading large-scale embedding and recommendation model architecture, distributed training, and real-time serving. Owns ML strategy and mentors engineers on personalization systems.

About the role

What You’ll Do

Architect and lead the development of next-generation, large-scale machine learning techniques.
Define and execute the ML strategy, identifying opportunities to enhance personalization and recommendation quality across Reddit.
Lead research initiatives on scalable machine learning systems and real-time model adaptation, bringing cutting-edge advancements into production.
Partner with ML infrastructure teams to build high-performance, distributed training systems that efficiently scale across multiple GPUs and cloud environments.
Establish and optimize real-time serving architectures for large-scale embeddings, ensuring low-latency inference and high throughput.
Collaborate cross-functionally with teams in Feed Ranking, Ads, Content Understanding, and Core ML to integrate ML models into Reddit’s key AI-driven systems.
Mentor and guide senior and mid-level ML engineers, fostering a culture of excellence, innovation, and knowledge sharing.
Stay at the forefront of AI research, evaluating and introducing new modeling paradigms to keep Reddit’s ML ecosystem cutting-edge.
Drive technical discussions, present findings to leadership, and contribute to long-term ML planning and decision-making.

Who You Might Be

8+ years of experience in machine learning engineering, with a strong focus on large-scale ML systems and recommendation or personalization systems.
Expertise in modern deep learning architectures, including sequence models and foundation models.
Deep understanding of complex multi-entity relationships in machine learning applications and how they are modeled in large-scale systems.
Proven ability to design, implement, and optimize scalable ML architectures, from distributed training to real-time inference.
Strong software engineering skills in Python, C++, or similar languages, with experience in ML infrastructure, high-performance computing, and cloud-based ML pipelines.
Demonstrated leadership in driving ML strategy, mentoring engineers, and influencing cross-functional teams.
Experience with A/B testing, model evaluation frameworks, and real-time feedback loops in large-scale production systems.
Excellent communication skills, with the ability to effectively present complex ML concepts to technical and non-technical stakeholders.

Benefits

Comprehensive Healthcare Benefits and Income Replacement Programs
401k with Employer Match
Global Benefit programs that fit your lifestyle, from workspace to professional development to caregiving support
Family Planning Support
Gender-Affirming Care
Mental Health & Coaching Benefits
Flexible Vacation & Paid Volunteer Time Off
Generous Paid Parental Leave

Skills

Python, C++, Deep Learning, Distributed Training, Real-time Inference, A/B Testing, Model Evaluation, ML Infrastructure, High-Performance Computing, Cloud-based ML Pipelines
