Senior Machine Learning Engineer, AI Platform

139k – 218k · United States · Remote · 4+ YOE · Jun 9

Summary

Design, build, and operate Mozilla's AI platform for training, deploying, and serving ML models at scale. Requires 4–6 years of experience building production ML systems, with strong Python and GPU/cloud infrastructure skills.

About the role

What You’ll Do

Design, build, and operate core AI platform components used to train, deploy, and serve machine learning models in production environments.
Own model serving and inference workflows end-to-end, driving improvements in reliability, scalability, performance, and operational excellence.
Lead efforts to optimize inference systems for throughput, latency, and cost efficiency across CPU and GPU workloads.
Design and manage GPU-based inference and training workloads, including performance tuning, capacity planning, and resource utilization optimization.
Own and improve critical parts of the model lifecycle, including packaging, versioning, testing strategies, validation, and deployment automation.
Implement and evolve observability practices (metrics, logging, tracing, alerting) to improve visibility and operational resilience of ML services and pipelines.
Partner closely with product, infrastructure, security, and data teams to design scalable platform capabilities that enable AI-powered features.
Contribute to technical design discussions, propose architectural improvements, and mentor junior engineers through code reviews and knowledge sharing.
Participate in and help improve operational processes, including incident response, on-call rotations, and post-incident reviews.

What You’ll Bring

Bachelor’s degree with 4–6 years of relevant industry experience, or Master’s degree with significant hands-on experience building and operating production ML systems, or equivalent work experience.
Strong experience developing in Python for machine learning systems, backend services, or distributed data processing.
Proven experience deploying and operating ML workloads in cloud environments, including production-grade infrastructure.
Solid understanding of model serving architectures, inference pipelines, and performance tradeoffs (latency, throughput, cost, scaling strategies).
Hands-on experience working with GPU-based workloads and accelerated computing in production settings.
Experience designing CI/CD pipelines and development workflows that support reliable ML system deployment.
Ability to independently scope and drive technical initiatives while balancing product and operational priorities.
Strong problem-solving skills and the ability to debug performance and reliability issues in distributed systems.
Clear and effective communication skills, with experience collaborating across engineering, product, and infrastructure teams.

Bonus Skills

Experience implementing inference optimization strategies such as batching, quantization, compilation, model conversion, or hardware-specific tuning.
Familiarity with containerization and orchestration systems (e.g., Docker, Kubernetes) in production environments.
Experience designing observability systems for distributed services, including metrics strategy and performance profiling.
Exposure to privacy-preserving ML techniques, security best practices, or responsible AI system design.
Contributions to open-source ML infrastructure projects or leadership in building reusable internal ML tooling.

What You’ll Get

Generous performance-based bonus plans for all eligible employees
Rich medical, dental, and vision coverage
Generous retirement contributions with 100% immediate vesting
Quarterly all-company wellness days
Country-specific holidays plus a day off for your birthday
One-time home office stipend
Annual professional development budget
Quarterly well-being stipend
Considerable paid parental leave
Employee referral bonus program
Other benefits (life/AD&D, disability, EAP, etc.), which vary by country

Skills

Python, Machine Learning, Model Serving, Inference Optimization, GPU Workloads, CI/CD, Docker, Kubernetes, Observability, Distributed Systems
