Systems Research Engineer Intern - GPU Programming
Intern developing and optimizing GPU-accelerated kernels for ML/AI applications. Requires strong GPU programming background (CUDA/Triton) and knowledge of performance optimization.
Responsibilities
- Optimize and fine-tune GPU code to achieve better performance and scalability
- Collaborate with cross-functional teams to integrate GPU-accelerated solutions into existing software systems
- Stay up to date with the latest advances in GPU programming techniques and technologies
Requirements
- Strong background in GPU programming and parallel computing, with experience in CUDA and/or Triton
- Knowledge of ML/AI applications and models
- Knowledge of performance profiling and optimization tools for GPU programming
- Excellent problem-solving and analytical skills
Internship Program Details
- Fall internship program spans 12 to 16 weeks (September 14th to December 18th)
- Opportunity to work with industry-leading engineers building a cloud from the ground up
- Opportunity to contribute to influential open source projects
Compensation
- Estimated US hourly rate: $58 to $63
- Competitive compensation, housing stipends, and other benefits
Research Intern, Inference
Research intern on the Inference team building efficient serving systems for large foundation models. Focus on distributed inference, compiler-aware optimization, and novel inference-time strategies.
Machine Learning Engineer II, Computer Vision Applied Science
Build and fine-tune vision-centric VLMs and generative models using Pinterest's visual-text datasets. Requires 2+ years industry computer vision experience and an M.S. or Ph.D.
Machine Learning Engineer
Build and deploy large-scale ML models for real-time fraud detection, engineering features from 1T+ events and maintaining production MLOps infrastructure on GCP. Requires 4+ years experience with Java/Scala, Python, Spark/Flink, and distributed systems.
Senior Machine Learning Engineer, AI Platform
Design, build, and operate Mozilla's AI platform for training, deploying, and serving ML models at scale. Requires 4-6 years experience building production ML systems with strong Python and GPU/cloud infrastructure skills.