# Machine Learning Researcher, Multimodal LLMs
**Company:** [Bland AI](https://hotfix.jobs/companies/bland-ai)
**Location:** Remote
**Salary:** $140K-$250K
**Skills:** LLMs, Multimodal Models, Speech-Language Systems, Prompting, Fine-Tuning, Alignment Techniques, Neural Audio Codecs, Conversational AI, Real-Time Voice Systems, Tool-Using Agents
**Posted:** 2026-04-21
> Develops next-generation multimodal LLMs integrating speech, text, tools, and real-time reasoning for conversational AI agents. Requires strong background in LLMs, multimodal models, fast experimentation, and production deployment experience.
## Responsibilities
- Contribute to the development of a next-generation multimodal LLM stack that combines speech, text, tools, and real-time reasoning into a single unified system.
- Build industry-leading conversational AI models that power Bland's agent, taking them from idea to production.
- Define how agents listen, think, and act in real time, integrating streaming audio, tool execution, and dynamic context.

## Requirements
- **Strong LLM / Multimodal Background**: Experience with LLMs, multimodal models, or speech-language systems. Deep understanding of prompting, fine-tuning, and alignment techniques. Familiarity with neural audio codecs and modern multimodal LLM techniques.
- **Fast Experimental Loop**: Ability to go from idea → dataset → experiment → conclusion in days. Design experiments that answer questions.
- **Product Intuition**: Strong sense of what makes an interaction feel natural rather than robotic. Translate abstract modeling ideas into user-facing improvements.
- **Builder Mentality**: Take ownership from research through deployment. Thrive in ambiguous, fast-moving environments. Care about impact over elegance.
- Think in systems and obsess over latency, correctness, and real-world behavior. Be comfortable discarding ideas quickly. Push toward simple abstractions.

## Nice-to-Haves (Bonus Points)
- Experience with real-time voice systems or conversational AI.
- Background in tool-using agents or agent frameworks.
- Experience with multimodal datasets (audio + text + actions).
- Contributions to LLM or speech-related research or open source.

## Compensation & Benefits
- Competitive salary: $180,000 – $260,000
- Meaningful equity
- Full healthcare, dental, vision

**Apply:** https://hotfix.jobs/jobs/machine-learning-researcher-multimodal-llms-at-bland-ai-0d857e5b-47e0-4856-b495-99a8d9047bd9
**Canonical:** https://hotfix.jobs/jobs/machine-learning-researcher-multimodal-llms-at-bland-ai-0d857e5b-47e0-4856-b495-99a8d9047bd9