Principal Engineer, AI Systems

319k – 479k | United States | Remote | 15+ YOE | Jun 10

Summary

Principal-level IC building and scaling production autonomous agents and agentic workflows across Block's ecosystem using frontier LLMs. Requires 15+ years of experience shipping AI systems from zero to production scale.

About the role

Responsibilities

Architect, build, ship, and own autonomous agents and agentic workflows end to end, delivering real business value for Block while ensuring reliability, safety, and performance at scale
Design agent orchestration systems including planning, tool use, memory, evaluation, and multi-agent coordination at production scale
Integrate and optimize frontier LLMs into agent architectures, making decisions on model selection, fine-tuning, prompt engineering, and context retrieval strategies
Drive deeper model optimization work (fine-tuning, distillation, RLHF) where it unlocks agent capability or efficiency
Lead detailed technical planning by breaking down ambitious objectives into concrete, sequenced tasks with clear ownership and execution paths
Provide technical mentorship and guidance to engineers across experience levels, elevating team capabilities through code review, pairing, and knowledge sharing
Partner closely with technical and non-technical stakeholders to translate business objectives into agent-powered product experiences
Keep Block at the frontier by continuously evaluating emerging AI capabilities and making pragmatic tradeoffs across model performance, latency, cost, and user experience
Foster a culture of technical excellence, high-quality delivery, rapid experimentation, and learning within your team and beyond

Requirements

15+ years of experience in software engineering or machine learning
Deep, recent professional experience building autonomous agents or agentic workflows in production environments, not just prototypes or demos
Fluency in the core primitives of agentic systems: context management, planning, tool use, memory, evaluation, and multi-step reasoning
Experience bringing frontier LLM capabilities into production products, with hands-on experience in prompt engineering, retrieval-augmented generation, and model optimization
A track record of taking AI-powered products from zero to scale in fast-paced, product-driven environments, with the judgment that comes from operating in production
Strong software engineering fundamentals with the ability to write production-quality code and make sound architectural decisions
Experience providing technical leadership within teams—you've shaped technical direction, driven execution, and elevated others
Product-minded engineering approach—you think in terms of user outcomes, not just model metrics
Excellent collaboration and communication skills, with the ability to build alignment across engineering, product, and design
Comfort navigating extreme ambiguity in a domain that's evolving weekly
Alignment with Block's mission of economic empowerment and using technology to create access and opportunity

Nice-to-Haves

Experience at leading AI organizations with a track record of translating research into production agent systems
Background building or scaling agentic products at startups (including early-stage or pivoting companies)
Experience with model fine-tuning, distillation, or RLHF to improve agent performance
Familiarity with agent evaluation, safety, and alignment challenges in production contexts

Skills

Python, LLMs, Prompt Engineering, RAG, Fine-tuning, RLHF, Agent Orchestration, Multi-agent Systems, Model Optimization, Production AI Systems

