Research Engineer/Research Scientist, Audio
Research Engineer/Scientist role focused on advancing audio capabilities in large language models, including training speech/audio models, developing codecs, and building conversational AI systems. Requires strong experience in audio ML research and engineering with JAX or PyTorch.
Responsibilities
- Work across the full stack of audio ML, developing audio codecs and representations
- Source and synthesize high-quality audio data
- Train large-scale speech language models and large audio diffusion models
- Develop novel architectures for incorporating continuous signals into LLMs
- Build advanced, steerable end-to-end conversational systems, speech and audio understanding models, and speech synthesis capabilities
- Collaborate with teams across pretraining, finetuning, reinforcement learning, production inference, and product
Requirements
- Hands-on experience with training audio models (conversational speech-to-speech, speech translation, speech recognition, text-to-speech, diarization, codecs, or generative audio models)
- Enjoy both research and engineering work, with a roughly 50/50 split
- Comfortable working across abstraction levels from signal processing fundamentals to large-scale model training and inference optimization
- Deep expertise with JAX, PyTorch, or large-scale distributed training; able to debug performance issues across the full stack
- Thrive in fast-moving environments
- Clear communication and effective collaboration skills
- Passionate about building conversational AI that feels natural, steerable, and safe
- Care about the societal impacts of voice AI
Nice-to-Haves
- Large language model pretraining and finetuning
- Training diffusion models for image and audio generation
- Reinforcement learning for large language models and diffusion models
- End-to-end system optimization, from performance benchmarking to kernel optimization
- Experience with GPUs, Kubernetes, PyTorch, or distributed training infrastructure
Representative Projects
- Training state-of-the-art neural audio codecs for 48 kHz stereo audio
- Developing novel algorithms for diffusion pretraining and reinforcement learning
- Scaling audio datasets to millions of hours of high-quality audio
- Creating robust evaluation methodologies for naturalness or expressiveness
- Studying training dynamics of mixed audio-text language models
- Optimizing latency and inference throughput for deployed streaming audio systems
Senior Member of Research Staff, Optimization
Lead optimization research applying large-scale constrained optimization and ML to real-time trading decisions. Requires 5-10+ years of experience, a strong math/ML background, production coding skills, and PhD-level coursework.
Staff Machine Learning Operations Engineer
Staff MLOps Engineer responsible for the reliability, performance, and cost-efficiency of production ML systems. Architects the ML platform, including feature stores, model registries, and automated CI/CD pipelines.
Staff+ Software Engineer, Inference Runtime
Technical lead for the shared, accelerator-agnostic inference runtime serving Claude. Owns architecture, performance, and validation for GPU/TPU/Trainium platforms in a high-scale distributed systems environment.
Researcher, Agent Post-Training, Personality
Researcher on the Agent Post-Training Personality team shaping how frontier agents communicate, collaborate, and build trust. Focuses on turning qualitative behavioral insights into evals, training data, reward signals, and model improvements.
Principal Engineer, AI Systems
Principal-level IC building and scaling production autonomous agents and agentic workflows across Block's ecosystem using frontier LLMs. Requires 15+ years of experience shipping AI systems from zero to production scale.