Machine Learning Engineer, Distributed Data Systems
Designs and scales distributed data infrastructure for large-scale multimodal AI training and evaluation. Collaborates with researchers to build reliable, high-performance systems in a fast-paced environment.
In this role, you will:
- Design, build, and maintain data infrastructure systems such as distributed compute, data orchestration, distributed storage, streaming infrastructure, and machine learning infrastructure, while ensuring scalability, reliability, and security.
- Ensure our data platform can scale by orders of magnitude while remaining reliable and efficient.
- Partner with researchers to deeply understand requirements and translate them into production-ready systems.
- Harden, optimize, and maintain critical data infrastructure systems that power multimodal training and evaluation.
You might thrive in this role if you:
- Have strong experience with distributed systems and large-scale infrastructure, along with a keen interest in data.
- Are detail-oriented and bring rigor to building and maintaining reliable systems.
- Demonstrate excellent software engineering fundamentals and organizational skills.
- Are comfortable with ambiguity and rapid change.
Senior Staff Machine Learning Engineer, Communication & Connectivity
Lead ML architecture and implementation for Airbnb's Messaging & Notifications, building recommendation engines, ranking systems, and LLM-powered experiences while mentoring engineers.
Staff Software Engineer
Founding Staff Applied Agent Engineer who will architect and lead Traba's agentic platform, building production LLM/agent systems that integrate with customer WMS/TMS/ERP and drive industrial operations. Requires 7+ years of engineering experience, including 2+ years building production agent systems.
Member of Technical Staff — Model Optimization and Inference
Optimize inference for real-time multimodal AI avatars. Specialize in LLM and diffusion model serving, KV cache strategies, quantization, and low-latency frameworks like vLLM and TensorRT-LLM.