AI Field Engineer

280k – 320k · San Mateo, CA · Onsite · 3+ YOE
Summary

Technical lead for Microsoft co-sell motions, building POCs, deploying LLMs on vLLM/SGLang, guiding fine-tuning strategy, and owning partner feedback loops. Requires 3+ years in pre-sales or partner engineering with strong Python, LLM inference, and Azure AI experience.

About the role

Technical Delivery and Deployment

  • Be the technical lead on co-sell motions with Microsoft — joint reference architectures, Azure Foundry integration patterns, and shared POCs for strategic accounts.
  • Build end-to-end POCs and MVPs alongside partner engineering teams, working inside their codebases, infrastructure, and constraints.
  • Run load tests and establish latency, throughput, and cost baselines against realistic customer traffic profiles, and tune deployments to hit those targets.
  • Deploy and validate new model families on inference frameworks (vLLM, SGLang), determining optimal shapes, quantization configs, and serving patterns across workloads.

Model Strategy and Fine-Tuning

  • Guide Microsoft’s customers on model selection, fine-tuning strategy (SFT, DPO, RFT), and evaluation methodology.
  • Build and run fine-tuning pipelines directly with customers, navigating trade-offs between model families, compute cost, and quality targets.
  • Design and implement evaluation frameworks that measure production-quality metrics, not just benchmark scores.

Product Feedback and Platform Improvement

  • Own the feedback loop — surface partner-driven product gaps to Fireworks engineering, and translate the roadmap back into partner messaging.
  • Ship external technical content: reference architectures, integration guides, and benchmark posts that make it easy for partners to win deals with us.
  • Track pipeline health; flag risks and opportunities to Field leadership weekly.

Minimum Qualifications

  • 3+ years in a pre-sales, partner engineering, forward-deployed, or technical consulting role.
  • Demonstrated ability to build production software with customers, not just advise on it. You have shipped code running in someone else's production environment.
  • Strong Python skills. Comfortable reading, writing, and debugging production code. Familiarity with Kubernetes and infrastructure engineering.
  • Hands-on fluency with LLM inference: latency/throughput tradeoffs, batching strategies, quantization, structured outputs, function calling.
  • Real experience with fine-tuning — LoRA at minimum, RFT a strong plus.
  • Deep familiarity with the Azure AI stack: Azure Foundry, Azure OpenAI Service, Azure ML, AKS, Entra/RBAC for AI workloads.
  • Exceptional communication: able to run a sharp discovery call, present to a VP, and debug a latency issue with an ML engineer in the same afternoon.

Preferred Qualifications

  • 5+ years in technical field or engineering roles where you've owned a technical relationship with a hyperscaler or major SI.
  • Experience with inference serving frameworks (vLLM, SGLang, TensorRT-LLM) and tuning deployments for real workloads.
  • Prior role at a hyperscaler, AI-native cloud, or inference provider.
  • Experience with agentic frameworks (LangChain, LlamaIndex, or custom tool-use pipelines).
  • Background in model evaluation.
  • Authorship of a technical blog post or reference architecture that people actually read.
  • Track record taking GenAI POCs from prototype to production-scale deployments.
Skills
Python, Kubernetes, vLLM, SGLang, TensorRT-LLM, Azure AI, Azure OpenAI Service, Azure ML, AKS, LoRA, LangChain, LlamaIndex