Machine Learning Researcher, Multimodal LLMs

Develops next-generation multimodal LLMs integrating speech, text, tools, and real-time reasoning for conversational AI agents. Requires strong background in LLMs, multimodal models, fast experimentation, and production deployment experience.

140k – 250k | San Francisco, CA | AI Research | Remote


About the role

Responsibilities

Contribute to the development of a next-generation multimodal LLM stack that combines speech, text, tools, and real-time reasoning into a single unified system.
Build industry-leading conversational AI models that power Bland's agent, taking them from idea to production.
Define how agents listen, think, and act in real time, integrating streaming audio, tool execution, and dynamic context.

Requirements

Strong LLM / Multimodal Background: Experience with LLMs, multimodal models, or speech-language systems. Deep understanding of prompting, fine-tuning, and alignment techniques. Familiarity with neural audio codecs and modern multimodal LLM techniques.
Fast Experimental Loop: Ability to go from idea → dataset → experiment → conclusion in days. Design experiments that answer questions.
Product Intuition: Strong sense for what makes an interaction feel natural vs. robotic. Translate abstract modeling ideas into user-facing improvements.
Builder Mentality: Take ownership from research through deployment. Thrive in ambiguous, fast-moving environments. Care about impact over elegance.
Think in systems and obsess over latency, correctness, and real-world behavior. Comfortable discarding ideas quickly. Push toward simple abstractions.

Nice-to-Haves (Bonus Points)

Experience with real-time voice systems or conversational AI.
Background in tool-using agents or agent frameworks.
Experience with multimodal datasets (audio + text + actions).
Contributions to LLM or speech-related research or open source.

Compensation & Benefits

Competitive salary: $180,000 – $260,000
Meaningful equity
Full healthcare, dental, vision

Skills

LLMs, Multimodal Models, Speech-Language Systems, Prompting, Fine-Tuning, Alignment Techniques, Neural Audio Codecs, Conversational AI, Real-Time Voice Systems, Tool-Using Agents

Similar roles


Bland AI

Copy of Machine Learning Researcher, Audio

Conducts foundational research and develops scalable ML models for speech-to-text, text-to-speech, and neural audio codecs in real-time voice AI agents. Requires deep expertise in voice modeling, self-supervised learning, and production deployment at enterprise scale.

140k – 250k | San Francisco, CA | AI Research | Remote | TTS | STT

Labelbox

Forward Deployed Research Scientist

Forward Deployed Research Scientist collaborates with frontier AI labs on data strategies, fine-tunes open-weight LLMs, runs ablation studies, and validates data impact for client projects. Requires MS/PhD in ML/NLP/CS, hands-on LLM fine-tuning, and fast-paced experimental rigor.

140k – 200k | San Francisco, CA | AI Research | Hybrid | DPO | LLMs

Astera

Research Scientist - Simplex

Develops theories of intelligence grounded in neural network internal structures, focusing on belief geometries in LLMs and biological brains. Conducts experiments bridging mathematics, ML interpretability, and safety research; requires PhD-level quantitative depth and hands-on coding.

140k – 200k | Emeryville, CA | AI Research | On-site | LLMs | PyTorch

Datadog

AI Research Engineer – Datadog AI Research (DAIR)

Builds ML infrastructure and tooling to productionize AI research in observability models, SRE agents, and code repair. Requires strong Python/ML systems expertise, distributed computing experience, and proficiency in PyTorch/JAX.

140k – 400k | New York, NY | AI Research | On-site | Go | JAX

Datadog

AI Research Scientist – Datadog AI Research (DAIR)

Conducts cutting-edge research in Generative AI, building foundation models and autonomous agents for cloud observability, SRE, and code repair. Requires PhD in ML or related field, publications at top conferences, and expertise in PyTorch/TensorFlow distributed training.

140k – 400k | New York, NY | AI Research | On-site | CUDA | PyTorch