Machine Learning Engineer: LLM Interpretability & Systems

Develops systems for LLM interpretability and deterministic governance by working directly with model weights, activations, and architectures. Implements mechanistic interpretability techniques like activation patching and control vectors for enterprise policy enforcement in production.

175k – 250k | San Francisco, CA | ML Engineering | Onsite

About the role

What You Will Do

Turn ideas from mechanistic interpretability and related research into code that runs in production.
Work directly with model internals to improve behavior and performance across commercial and open-source models.
Leverage techniques like activation patching, control vectors, and feature extraction to achieve targeted, repeatable improvements in model output.
Build the evaluation and deployment loops needed to ship changes reliably into enterprise environments.
Design and optimize the feature-level intervention systems that enable deterministic policy enforcement at inference time.

Who You Are

Have a strong understanding of Transformer architectures, PyTorch internals, and the mathematical foundations of deep learning.
Have trained, fine-tuned, or optimized models beyond superficial augmentation.
Can read a paper, decide what matters, and implement it.
Notice when something is not working and take ownership of fixing it.
Are motivated by the challenge of making large language models reliable and controllable enough for the highest-stakes enterprise applications.

What We Offer

Compensation & Equity: Competitive base compensation, plus significant equity in a venture-backed company with institutional investors including Google’s Gradient Ventures, General Catalyst, and Y Combinator. We want people who think and act like owners.
Real Impact: You will work directly on the core systems that determine how models perform in the wild. Your work ships into real, high-stakes environments where governance, auditability, and performance are non-negotiable.
Autonomy & Trust: We operate with a high degree of trust. You are expected to form strong technical opinions and execute on them.

Skills

PyTorch, Transformers, Mechanistic Interpretability, Activation Patching, Control Vectors, Feature Extraction, LLMs, Deep Learning, Fine-Tuning, Model Optimization
