Customer Support Engineer (GPU Cluster)

Resolves complex technical issues for customers using Kubernetes GPU clusters for AI training and inference. Requires 3+ years of customer-facing experience, AI/ML/GPU expertise, and infrastructure skills such as Kubernetes and Ansible.

160k – 230k | San Francisco, CA | Support Engineering | Remote | 3+ YOE

About the role

Responsibilities

Engage directly with customers to tackle and resolve complex technical challenges involving Kubernetes GPU clusters; ensure swift and effective solutions.
Become a product expert in the GPU Cluster service, serving as the last line of technical defense before escalation.
Collaborate across Engineering, Research, and Product teams to address customer concerns and ensure satisfaction.
Transform customer insights into action by identifying patterns and driving roadmap improvements.
Maintain documentation of system configurations, procedures, troubleshooting guides, and FAQs.
Provide support coverage during holidays, nights, and weekends as required.

Requirements

3+ years in a customer-facing technical role, with 1+ year in support for an AI or mission-critical SaaS API.
Strong background in AI, ML, GPU technologies, and HPC environments.
Familiarity with Kubernetes, Slurm, Ansible, high-performance networks, NFS storage, containers, and scripting or programming.
Foundational knowledge in compute cluster installation, configuration, administration, troubleshooting, and security.
Complex technical problem solving and proactive issue resolution.
Cross-functional collaboration with Sales, Engineering, Support, Product, Research.
Strong ownership, willingness to learn, and communication skills for technical and non-technical audiences.
Ability to manage dynamic environments, multiple projects, and context switching.

Compensation

US base salary: $160,000-$230,000 + equity + benefits. Compensation is determined by location, level, experience, and skills.

Skills

Kubernetes, GPU, AI, Machine Learning, Slurm, Ansible, HPC, NFS, Containers, Scripting

Similar roles

Together AI

Customer Support Engineer (Inference)

Provides technical support for AI inference and fine-tuning services on GPU clusters. Requires 5+ years of customer-facing technical experience with strong AI/ML and infrastructure expertise.

160k – 230k | San Francisco, CA | Support Engineering | Remote | 5+ YOE | GPU | HPC

Blacksmith

Technical Support Engineer

Diagnoses and resolves complex issues for customers using Blacksmith's high-scale CI infrastructure, reproduces bugs with engineering, and builds automations. Requires experience with distributed systems, AI/agent workflows, and customer-focused technical support.

160k – 180k | New York, NY | Support Engineering | On-site | Ceph | Linux

Nooks

Manager, Technical Support - AI Sequencing

Leads and scales the technical support team for the AI Sequencing product, managing KPIs across chat, email, and Slack, building AI automations and a knowledge base, and driving cross-functional fixes with Eng/PM. Requires 5+ years in technical support, with 2+ in leadership at B2B SaaS startups.

154k – 206k | San Francisco, CA | Support Engineering | Hybrid | 5+ YOE | APIs | VoIP

Gigs

Support Operations Engineer

Owns customer interactions end-to-end, identifies recurring issues for product and engineering improvements, maintains AI-powered knowledge bases and support tooling, and analyzes data for operational enhancements in a B2B tech environment.

150k – 180k | New York, NY | Support Engineering | Hybrid | AI | SQL

Onebrief

Customer Success Engineer, Battle Road

Deploys and configures the AtomEngine simulation platform in military training environments, troubleshoots issues live, and trains stakeholders. Requires 3+ years of software engineering experience, programming proficiency (C#, Python, etc.), Secret clearance, and up to 50% travel.

150k – 185k | United States | Support Engineering | Remote | 3+ YOE | C# | C++