Solution Architect (AI/LLM Inference)
165k – 330k · San Francisco, CA · New York, NY · Solutions Architecture · Hybrid
Summary
Partners with Sales to conduct technical discovery, lead demos, scope AI/LLM inference deployments, and manage POCs for customers adopting AI at scale. Requires AI/ML expertise, customer-facing skills, and the ability to prototype solutions.
About the role
Responsibilities
- Partner with Sales on customer discovery calls (most often second calls, occasionally first calls for large accounts).
- Lead demos and technical scoping to align on success criteria, architecture, and deployment approach.
- Own benchmarking and repeatable deployments, including:
  - Handling standard deployment patterns and configurations across many modalities – LLMs, embeddings, image and video generation, VoiceAI, etc.
  - Advising on tradeoffs like H100s vs B200s and latency-optimized vs throughput-optimized setups.
  - Driving consistent "playbook" style deployments for common models and use cases.
- Become a power user of inference runtimes such as vLLM, SGLang, and TRT-LLM, including their common configurations and the tradeoffs between them.
- Drive POC and project execution, including:
  - Scoping POCs and keeping stakeholders aligned on timeline, deliverables, and next steps.
  - Acting as the "ringleader" or project manager for POCs.
  - Pulling in Forward Deployed Engineering (FDE) support when deeper or more complex technical work is needed.
Requirements
- AI/ML background and the ability to credibly discuss AI/ML topics with technical stakeholders.
- Strong customer-facing communication skills, including the ability to run structured discovery and clarify ambiguous requirements.
- Technical depth to scope solutions, without needing to write production code.
- Ability to script and prototype as needed, including comfort "vibe coding" to move quickly in technical workflows.
Nice to Have
- Experience running or supporting benchmarks for ML inference deployments.
- Familiarity with infrastructure tradeoffs relevant to inference performance and cost (for example GPU selection and latency versus throughput tuning).
- Experience serving as a cross-functional technical lead for customer POCs, including coordination across Sales and Engineering.
Benefits
- Competitive compensation, including meaningful equity.
- 100% coverage of medical, dental, and vision insurance for employee and dependents.
- Flexible PTO policy, including a company-wide Winter Break (offices closed from Christmas Eve to New Year's Day).
- Paid parental leave.
- Fertility and family-building stipend through Carrot.
- Company-facilitated 401(k).
- Exposure to a variety of ML startups, offering unparalleled learning and networking opportunities.
Skills
AI/ML, LLMs, vLLM, SGLang, TRT-LLM, GPU, H100, B200, embeddings, inference, benchmarks, Python, scripting