Research Engineer, Interpretability

Research Engineer focused on mechanistic interpretability, building tools and infrastructure to reverse-engineer neural networks for safer AI. Requires 5+ years software experience, Python proficiency, and AI research contributions.

315k – 560k | San Francisco, CA | AI Research | Hybrid | 5+ YOE

About the role

Responsibilities

  • Implement and analyze research experiments, both quickly in toy scenarios and at scale in large models
  • Set up and optimize research workflows to run efficiently and reliably at large scale
  • Build tools and abstractions to support a rapid pace of research experimentation
  • Develop and improve tools and infrastructure to support other teams in using Interpretability’s work to improve model safety

You may be a good fit if you

  • Have 5-10+ years of experience building software
  • Are highly proficient in at least one programming language (e.g., Python, Rust, Go, Java) and productive with Python
  • Have some experience contributing to empirical AI research projects
  • Have a strong ability to prioritize and direct effort toward the most impactful work and are comfortable operating with ambiguity and questioning assumptions
  • Prefer fast-moving collaborative projects to extensive solo efforts
  • Want to learn more about machine learning research and its applications and collaborate closely with researchers
  • Care about the societal impacts and ethics of your work

Strong candidates may also have experience with

  • Designing a code base so that anyone can quickly code experiments, launch them, and analyze their results without hitting bugs
  • Optimizing the performance of large-scale distributed systems
  • Collaborating closely with researchers
  • Language modeling with transformers
  • GPUs or PyTorch

Representative Projects

  • Building Garcon, a tool that allows researchers to easily access LLM internals from a Jupyter notebook
  • Setting up and optimizing a pipeline to efficiently collect petabytes of transformer activations and shuffle them
  • Profiling and optimizing ML training, including parallelizing to many GPUs
  • Making it fast and easy to launch ML experiments and to manipulate and analyze their results
  • Creating an interactive visualization of attention between tokens in a language model

Skills

Python, Rust, Go, Java, PyTorch, Transformers, GPUs, Jupyter, Distributed Systems, Machine Learning
