Solution Architect (AI/LLM Inference)

165k – 330k | San Francisco, CA | New York, NY | Solutions Architecture | Hybrid | May 10

Summary

Partners with Sales to conduct technical discovery, lead demos, scope AI/LLM inference deployments, and manage POCs for customers adopting AI at scale. Requires AI/ML expertise, customer-facing skills, and the ability to prototype solutions.

About the role

Responsibilities

Partner with Sales on customer discovery calls (most often second calls, occasionally first calls for large accounts).
Lead demos and technical scoping to align on success criteria, architecture, and deployment approach.
Own benchmarking and repeatable deployments, including:
- Handling standard deployment patterns and configurations across many modalities, including LLMs, embeddings, image and video generation, and voice AI.
- Advising on tradeoffs like H100s vs B200s and latency-optimized vs throughput-optimized setups.
- Driving consistent "playbook" style deployments for common models and use cases.
Become a power user of inference runtimes such as vLLM, SGLang, and TRT-LLM, and learn the common configurations and tradeoffs between them.
Drive POC and project execution, including:
- Scoping POCs and keeping stakeholders aligned on timeline, deliverables, and next steps.
- Acting as the "ringleader" or project manager for POCs.
- Pulling in Forward Deployed Engineering (FDE) support when deeper or more complex technical work is needed.

Requirements

AI/ML background and the ability to credibly discuss AI/ML topics with technical stakeholders.
Strong customer-facing communication skills, including the ability to run structured discovery and clarify ambiguous requirements.
Technical depth to scope solutions, without needing to write production code.
Ability to script and prototype as needed, including comfort "vibe coding" to move quickly in technical workflows.

Nice to Have

Experience running or supporting benchmarks for ML inference deployments.
Familiarity with infrastructure tradeoffs relevant to inference performance and cost (for example GPU selection and latency versus throughput tuning).
Experience serving as a cross-functional technical lead for customer POCs, including coordination across Sales and Engineering.

Benefits

Competitive compensation, including meaningful equity.
100% coverage of medical, dental, and vision insurance for employees and dependents.
Flexible PTO policy, including a company-wide Winter Break (offices closed from Christmas Eve to New Year's Day).
Paid parental leave.
Fertility and family-building stipend through Carrot.
Company-facilitated 401(k).
Exposure to a variety of ML startups, offering unparalleled learning and networking opportunities.

Skills

AI/ML, LLMs, vLLM, SGLang, TRT-LLM, GPU, H100, B200, embeddings, inference, benchmarks, Python, scripting
