Research Intern, Inference

121k – 131k · San Francisco, CA · Onsite · Entry level
Summary

Research intern on the Inference team building efficient serving systems for large foundation models. Focus on distributed inference, compiler-aware optimization, and novel inference-time strategies.

About the role

Responsibilities

  • Design and conduct rigorous experiments to validate hypotheses
  • Communicate the plans, progress, and results of projects to the broader team
  • Document findings in scientific publications and blog posts

Requirements

  • Currently in the final year of a Bachelor's, Master's, or Ph.D. program in Computer Science, Electrical Engineering, or a related field
  • Strong knowledge of Machine Learning and Deep Learning fundamentals
  • Experience with deep learning frameworks (PyTorch, JAX, etc.)
  • Strong programming skills in Python
  • Familiarity with Transformer architectures and recent developments in foundation models

Preferred Qualifications

  • Prior research experience in foundation models, efficient machine learning, or ML systems
  • Publications at leading conferences in machine learning or systems (e.g., MLSys, ICLR)
  • Experience with CUDA programming (for kernel development)
  • Understanding of model optimization techniques and hardware acceleration approaches
  • Contributions to open-source machine learning projects

Internship Program Details

  • Fall internship program spans 12 to 16 weeks
  • Dates: September 14th to December 18th

Skills

PyTorch, JAX, Python, Machine Learning, Deep Learning, Transformer architectures, CUDA, ML systems