Research Engineer/Research Scientist, Audio

350k – 500k · San Francisco, CA · Hybrid · 7+ YOE
Summary

Research Engineer/Scientist role focused on advancing audio capabilities in large language models, including training speech/audio models, developing codecs, and building conversational AI systems. Requires strong experience in audio ML research and engineering with JAX or PyTorch.

About the role

Responsibilities

  • Work across the full stack of audio ML, developing audio codecs and representations
  • Source and synthesize high quality audio data
  • Train large-scale speech language models and large audio diffusion models
  • Develop novel architectures for incorporating continuous signals into LLMs
  • Build advanced, steerable end-to-end conversational systems, speech and audio understanding models, and speech synthesis capabilities
  • Collaborate with teams across pretraining, finetuning, reinforcement learning, production inference, and product

Requirements

  • Hands-on experience with training audio models (conversational speech-to-speech, speech translation, speech recognition, text-to-speech, diarization, codecs, or generative audio models)
  • Enjoy both research and engineering work, with a roughly 50/50 split between them
  • Comfortable working across abstraction levels from signal processing fundamentals to large-scale model training and inference optimization
  • Deep expertise with JAX, PyTorch, or large-scale distributed training; able to debug performance issues across the full stack
  • Thrive in fast-moving environments
  • Clear communication and effective collaboration skills
  • Passionate about building conversational AI that feels natural, steerable, and safe
  • Care about the societal impacts of voice AI

Nice-to-Haves

  • Large language model pretraining and finetuning
  • Training diffusion models for image and audio generation
  • Reinforcement learning for large language models and diffusion models
  • End-to-end system optimization, from performance benchmarking to kernel optimization
  • Experience with GPUs, Kubernetes, PyTorch, or distributed training infrastructure

Representative Projects

  • Training state-of-the-art neural audio codecs for 48 kHz stereo audio
  • Developing novel algorithms for diffusion pretraining and reinforcement learning
  • Scaling audio datasets to millions of hours of high quality audio
  • Creating robust evaluation methodologies for naturalness or expressiveness
  • Studying training dynamics of mixed audio-text language models
  • Optimizing latency and inference throughput for deployed streaming audio systems
Skills
JAX, PyTorch, Audio ML, Speech Recognition, Text-to-Speech, Diffusion Models, Large Language Models, Distributed Training, Signal Processing, Kubernetes