Full-Stack AI Engineer
156k – 270k | San Francisco, CA | ML Engineering | Hybrid | 5+ YOE
Summary
Design and deploy end-to-end AI systems that power real products. Own the full intelligence stack, including retrieval, agent orchestration, evals, and governance. Requires Python proficiency, LLM experience, and experience building production AI systems.
About the role
What You'll Own
- Designing, building, and deploying end-to-end AI systems across the full intelligence stack — from data context and retrieval to agent orchestration, evals, versioning, and governance.
- Developing agent pipelines, prompt chains, and orchestration frameworks for LLM-driven workflows.
- Selecting the right AI technique (LLMs, classic ML, or hybrid approaches) for the problem at hand.
- Collaborating with PMs, engineers, and data scientists to define requirements and deliver solutions.
- Building scalable AI services with monitoring, evaluation, and deployment pipelines.
- Contributing reusable patterns to Komodo's AI infrastructure and internal tooling ecosystem.
Requirements
- Experience building production-grade AI systems or AI-powered applications.
- Strong proficiency in Python.
- Experience working with LLMs, prompt engineering, or agent-based architectures.
- Familiarity with modern GenAI tooling and frameworks: vLLM, CrewAI, Strands, OpenAI / Chat Completions APIs.
- Ability to integrate AI capabilities across backend services and product interfaces.
- Experience designing evaluation frameworks, testing strategies, or monitoring systems for AI features.
- Strong collaboration skills across engineering, product, and data teams.
Nice to Have
- Healthcare data expertise.
- Experience with distributed computing frameworks and data platforms (e.g., Spark, Snowflake, Databricks) for large-scale data processing.
Skills
Python, LLMs, Prompt Engineering, Agent-based Architectures, vLLM, CrewAI, Strands, OpenAI API, Evaluation Frameworks, Monitoring Systems