# Member of Technical Staff — Audio and Voice AI
**Company:** [Stuut](https://hotfix.jobs/companies/stuut)
**Location:** San Francisco, CA; New York, NY
**Salary:** $220K-$320K
**Experience:** 5+ years
**Skills:** Python, PyTorch, TensorFlow, Transformers, Speech-To-Text, Text-To-Speech, Audio Processing, Multimodal Models, LLMOps, MLOps
**Posted:** 2026-06-17
> Design, build, and deploy production-grade voice and audio AI systems, including real-time agents and speech-driven workflows for financial operations. Requires 5+ years of engineering experience with a focus on applied AI/ML or speech systems.
## Job Description
## What You’ll Do
- Build & Deploy Voice AI Systems: design and ship production-ready audio and voice-based AI features, including real-time voice agents and speech-driven workflows.
- Craft High-Quality Voice UX: use modern speech-to-text, text-to-speech, and conversational AI platforms to create natural, responsive, and emotionally aware voice experiences tailored to financial use cases.
- Adapt & Fine-Tune Audio and Multimodal Models: fine-tune and optimize speech, audio, and LLM-based models for accuracy, latency, and reliability in real-world environments.
- Engineer Real-Time, Scalable AI Pipelines: build end-to-end AI/ML pipelines spanning audio ingestion, streaming inference, orchestration, and monitoring with enterprise-grade availability and performance.
- Establish Evaluation & Monitoring Frameworks (LLMOps): design rigorous evaluation systems to measure quality, latency, accuracy, drift, and business outcomes for voice and text-based AI systems.
- Automate Financial Workflows via Voice: develop AI-powered voice automations that reduce manual effort in collections, reconciliation, and customer communication.
- Collaborate Cross-Functionally: partner with Product, Engineering, Design, and customers to translate business needs into effective, user-centered voice AI solutions.
- Measure & Communicate Impact: define success metrics and continuously improve AI systems based on real-world usage and customer feedback.

## You Might Be a Fit If You…
- Have 5+ years of software engineering experience, with 2+ years focused on applied AI/ML, speech, or audio systems in production.
- Have built and shipped voice, audio, or conversational AI systems used by real customers.
- Have experience with speech-to-text, text-to-speech, audio processing, or multimodal models.
- Have integrated and fine-tuned LLMs for conversational or agent-based systems.
- Understand LLMOps / MLOps best practices, including deployment pipelines, monitoring, evaluation, and A/B testing.
- Are fluent in Python and experienced with PyTorch, TensorFlow, Transformers, or audio ML frameworks.
- Have built real-time or low-latency systems and understand the tradeoffs involved.
- Can translate business and UX requirements into robust, scalable AI solutions.
- Have experience integrating AI systems into existing enterprise or SaaS platforms.
- Enjoy working on ambiguous problems where product definition, UX, and engineering meet.

## Compensation
Top-of-market salary and equity package

## Benefits (for U.S.-based full-time employees)
- Medical, dental & vision insurance coverage for you
- 401(k) & Match
- Equity
- Flexible PTO
- Parental Leave

**Apply:** https://hotfix.jobs/jobs/member-of-technical-staff-audio-and-voice-ai-at-stuut-be841ac6-7856-43df-b21f-0540dd7e0f37
**Canonical:** https://hotfix.jobs/jobs/member-of-technical-staff-audio-and-voice-ai-at-stuut-be841ac6-7856-43df-b21f-0540dd7e0f37