Thinking Machines Lab
AI research lab building customizable multimodal AI systems
Thinking Machines Lab develops AI systems and platforms focused on multimodal models for research and production. It serves researchers, engineers, and enterprises by enabling customizable, collaborative AI through tools like Tinker for fine-tuning. It also advances open science and efficiency in large-scale training and inference.
Network Engineer, Supercomputing
Owns and debugs the multi-thousand-GPU network fabric (RDMA/RoCE, NVLink/NVSwitch) for large-scale AI training and inference. Requires backend language proficiency, large-scale cluster experience, and cross-stack ownership.
Assistant Controller
Owns the core accounting function and financial operations at an AI lab. Leads month-end close, financial reporting, internal controls, and audit readiness, and builds scalable accounting infrastructure.
Associate General Counsel, Corporate & Commercial
Leads the corporate legal function, handling equity financings, governance, and commercial contracting for an AI research lab. Requires 10+ years of experience in both corporate transactions and tech commercial law.
Compensation Partner
Designs and implements compensation frameworks, advises on pay decisions for offers, promotions, and retention, builds equity strategy, and communicates compensation transparently to build trust. Requires 7+ years in high-growth environments and a strong point of view on compensation philosophy.
Software Engineer, Data Infrastructure
Builds and scales data infrastructure for distributed training pipelines, multimodal data catalogs, and petabyte-scale processing systems. Collaborates with researchers using distributed systems like Spark, Ray, Kafka, and cloud data architectures. Requires backend proficiency in Python/Rust and a bachelor's degree in CS or engineering.
Site Reliability Engineer (SRE)
Drives end-to-end reliability for the AI fine-tuning platform Tinker, including SLOs, monitoring, incident response, and multi-tenant GPU scheduling. Requires distributed systems experience, software proficiency for reliability work, and experience handling production incidents.
Research, Vision Expertise
Conducts research on visual perception, multimodal learning, and large-scale AI model training. Designs architectures, builds datasets and evaluations, and collaborates on frontier models. Requires ML expertise, Python proficiency, and experimental rigor.
Research Product Manager
Drives large-scale research products and programs in AI, coordinating cross-functional teams to translate technical ideas into scoped plans and integrate research into production systems. Requires a CS/AI degree, experience in research or product management, and the ability to thrive in technical, ambiguous environments.
Research, Pre-Training Science
Conducts research on pre-training methodologies for large AI models, develops new architectures and data strategies, runs large-scale experiments, and publishes findings. Requires strong ML fundamentals, Python proficiency, and experience with deep learning frameworks.
Research, Pre-Training Data
Designs and implements methods for sourcing, curating, and analyzing large-scale pre-training datasets for AI models, blending research with production-grade data engineering. Requires Python proficiency, deep learning frameworks, and strong ML fundamentals.
Research, Post-Training Data
Conducts post-training research for AI models, designing data collection strategies, developing labeling pipelines, modeling human preferences, and iterating on evaluations to improve model alignment, reasoning, and helpfulness. Requires strong Python skills, ML framework proficiency, and experimental rigor.
Research, Post-Training
Develops and tunes post-training recipes for AI models, iterates on evaluations, debugs configurations, scales methodologies, and publishes research to advance collaborative intelligence. Requires Python proficiency, deep learning frameworks, and strong ML fundamentals.
Research Engineer, Infrastructure, Training Systems
Designs and optimizes distributed training systems scaling across thousands of GPUs for large AI models. Requires strong systems engineering, PyTorch/JAX expertise, and a collaborative mindset to boost research productivity.
Forward Deployed Engineer, Tinker
Triages and resolves customer issues for the AI fine-tuning platform Tinker, builds tools and automation, writes documentation, and influences the product roadmap based on direct customer feedback. Requires experience with LLM fine-tuning, debugging distributed systems, and backend languages like Python or Rust.
Research Engineer, Infrastructure, RL Systems
Designs and optimizes infrastructure for scalable reinforcement learning training of large models, improving reliability, observability, and throughput. Collaborates with researchers to productionize RL algorithms; requires strong engineering skills and deep learning framework knowledge.
Research Engineer, Infrastructure, Numerics
Designs and optimizes distributed training infrastructure for large-scale LLMs, focusing on low-precision numerics, kernel optimizations, and communication frameworks to enable stable, scalable trillion-parameter model training. Requires strong systems engineering, knowledge of deep learning frameworks, and a collaborative research mindset.
Research Engineer, Infrastructure, Kernels
Designs and optimizes high-performance ML kernels (CUDA, CuTe, Triton) for large-scale LLM training, focusing on GPU efficiency, low-precision formats, and distributed compute. Collaborates with researchers to bridge algorithms and hardware.
Research Engineer, Infrastructure, Inference
Designs, optimizes, and scales infrastructure for high-performance AI model inference, focusing on latency, throughput, efficiency, and reliability. Collaborates with researchers to enable production deployment of large-scale models using deep learning frameworks and distributed systems.
Research, Audio Expertise
Conducts research to advance audio capabilities in AI models, designing and training large-scale multimodal systems, building audio data pipelines, and publishing findings. Requires ML expertise, Python proficiency, and experience with deep learning frameworks.
Infrastructure Engineer, Security
Owns and evolves security infrastructure across compute, storage, networking, and data platforms for foundation models. Architects secure patterns, manages identities/secrets, builds threat models, and automates security checks in Kubernetes/cloud environments. Requires strong systems programming and infra experience.
HR Business Partner
Coaches managers on leadership, performance, and team dynamics while designing scalable people systems including compensation, career frameworks, and feedback processes for a high-growth AI research lab.
Engineering Manager
Leads a team of senior/staff engineers building scalable ML infrastructure and products, owning system design, reliability, and execution while contributing hands-on and hiring top talent. Requires 8+ years in production systems and 3+ years managing engineers.
Designer
Designs end-to-end AI product experiences, from concepts and prototypes to coded UI implementations, while collaborating on model training and contributing to design systems. Requires exceptional visual craft, coding ability to ship features, and interest in AI development.