# Software Engineer, ML Serving
**Company:** [Unusual](https://hotfix.jobs/companies/unusual)
**Location:** San Francisco, CA
**Skills:** NVIDIA Triton, vLLM, SGLang, Tensor Parallel, Pipeline Parallel, Docker, Kubernetes, Terraform, Linux, gRPC
**Posted:** 2026-06-24
> Own the serving infrastructure connecting ML inference engines to production, building real-time TTS systems on GPU fleets with distributed model serving and cloud infrastructure.
## Job Description
## What You'll Own
- Architecture and implementation of Rime's TTS serving infrastructure, from GPU-backed inference engines to the API surface.
- Model optimization, from single-node to disaggregated fleet serving.
- Compatibility across NVIDIA hardware generations, from Hopper to Blackwell and beyond, for on-prem and cloud deployments.
- Continuous integration and deployment workflows for the model serving pipeline.
- Site reliability: on-call rotation, monitoring, alerting, and observability across the serving stack.
- Resource provisioning and cost management across our GPU fleet.

## What We're Looking For
- Hands-on experience with real-time, multinode ML serving infrastructure, including an ML serving framework: NVIDIA Dynamo/Triton, vLLM, SGLang, or equivalent.
- Experience with distributed or disaggregated model serving (Tensor Parallel, Pipeline Parallel, or equivalent).
- Strong cloud infrastructure fundamentals: Linux internals, networking, containerization (Docker, Kubernetes).
- Infrastructure-as-code (IaC) experience: Terraform, Packer, or comparable.
- On-call is part of the job. You treat production reliability as a shared responsibility.

## Nice to Have
- Experience with multinode training (DDP, FSDP, etc.).
- Experience with gRPC or other bidirectional binary streaming protocols.
- Experience with audio streaming and related technologies (WebRTC, WebSockets, etc.).
- Experience with a multi-language monorepo where you choose the language for each task on merit rather than personal familiarity.
- Experience with multi-cloud infrastructures (AWS, GCP, OCI, etc.).
- Comfort with configuration management tooling (Ansible, Chef, Puppet, or similar).
- SRE, DevOps, or platform engineering background at a startup.
- Experience at an early-stage company.
**Apply:** https://hotfix.jobs/jobs/software-engineer-ml-serving-at-unusual-a766b5cb-314d-4dd8-8fd4-afa254e360dc
**Canonical:** https://hotfix.jobs/jobs/software-engineer-ml-serving-at-unusual-a766b5cb-314d-4dd8-8fd4-afa254e360dc