AI Engineer

Builds and deploys production-scale AI/ML systems using LLMs, from fine-tuning and evaluation to low-latency infrastructure. Requires 5+ years of experience with PyTorch/TensorFlow, MLOps, and AWS, and a record of taking models to production at high-growth startups.

200k – 250k | New York, NY | ML Engineering | Hybrid | 5+ YOE

About the role

Responsibilities

Own the complete lifecycle of large language model implementation: from data preparation and fine-tuning through rigorous evaluation and production deployment.
Develop automated evaluation frameworks that continuously assess model accuracy, identify edge cases, and quantify improvements across iterations.
Work directly with product managers and engineers to integrate AI as a core product capability.
Shape our AI roadmap by staying current with industry developments, evaluating emerging techniques, and making pragmatic adoption decisions.
Design and implement low-latency, high-throughput, cloud-based AI/ML systems capable of handling thousands of requests per second.
Build the foundational infrastructure - model serving, monitoring, deployment pipelines, and automated testing frameworks - that enables rapid experimentation and iteration while maintaining production reliability.

Requirements

5+ years of engineering experience, including demonstrated hands-on experience applying LLMs and agents in industry.
Experience at a high-growth startup building machine learning infrastructure from the ground up.
Demonstrated ability to take models from research/experimentation through production deployment at scale.
Fluency in Python and related AI/ML frameworks (TensorFlow, PyTorch, Keras, etc.).
Hands-on experience with LLMs and contemporary AI engineering patterns: RAG architectures, embedding models, vector databases, prompt engineering, and fine-tuning strategies.
Curious, systematic, and execution-oriented: you don't wait for perfect requirements and can navigate technical tradeoffs independently.
Strong foundation in MLOps: CI/CD for ML, model versioning, monitoring, and observability.
Strong technical background in AWS cloud architecture and automated infrastructure provisioning with Terraform.

Nice-to-haves

Experience with agentic frameworks like LangChain.

Skills

Python, PyTorch, TensorFlow, Keras, LLMs, RAG, MLOps, AWS, Terraform, LangChain, Vector Databases, Prompt Engineering, Fine-Tuning, CI/CD, Model Serving
