Machine Learning Engineer, Personalization
New York, NY | Remote
Summary
Develop ML systems using LLMs for content enrichment in music, podcasts, and audiobooks at Spotify. Collaborate cross-functionally to build scalable pipelines and improve personalization recommendations.
About the role
What You'll Do
- Use in-house and third-party LLMs to solve language understanding problems
- Employ techniques such as fine-tuning and RAG to improve models
- Contribute to designing, building, evaluating, shipping, and refining Spotify’s product through hands-on ML development
- Help drive optimization, testing, and tooling to improve the quality of our content enrichment assets
- Collaborate with cross-functional teams of MLEs, data and backend engineers, and other stakeholders including tech research, data science, and product to develop new features and technologies
- Participate in our AI Foundation’s ML community and work collaboratively and efficiently within our existing platforms and systems
- Perform data analysis to establish baselines and inform product decisions
- Stay up-to-date on the latest machine learning algorithms and techniques
Who You Are
- Strong background in machine learning, especially experience with Large Language Models
- Professional experience in applied machine learning
- Extensive experience working in a product- and data-driven environment (Python, Scala, Java, SQL; Python is required) and with cloud platforms (GCP or AWS)
- Hands-on experience implementing or prototyping machine learning systems at scale
- Experience architecting data pipelines, and self-sufficiency in getting the data you need to build and evaluate models, using tools like Dataflow, Apache Beam, or Spark
- Care about agile software processes, data-driven development, reliability, and disciplined experimentation
- Experience with, and passion for, fostering collaborative teams
- Experience with PyTorch, TensorFlow, and/or other scalable machine learning frameworks. Experience with Ray or TFX is a plus
- Bonus if you have experience architecting near-real-time pipelines
Skills
Large Language Models, Python, PyTorch, TensorFlow, GCP, AWS, Apache Beam, Spark, Dataflow, Ray