Principal Engineer, AI Systems

319k – 479k | United States | Remote | 15+ YOE

Summary

Principal-level individual contributor (IC) role building and scaling production autonomous agents and agentic workflows across Block's ecosystem using frontier LLMs. Requires 15+ years of experience, including shipping AI systems from zero to production scale.

About the role

Responsibilities

  • Architect, build, ship, and own the end-to-end delivery of autonomous agents and agentic workflows that deliver real business value for Block, ensuring reliability, safety, and performance at scale
  • Design agent orchestration systems including planning, tool use, memory, evaluation, and multi-agent coordination at production scale
  • Integrate and optimize frontier LLMs into agent architectures, making decisions on model selection, fine-tuning, prompt engineering, and context retrieval strategies
  • Drive deeper model optimization work (fine-tuning, distillation, RLHF) where it unlocks agent capability or efficiency
  • Lead detailed technical planning by breaking down ambitious objectives into concrete, sequenced tasks with clear ownership and execution paths
  • Provide technical mentorship and guidance to engineers across experience levels, elevating team capabilities through code review, pairing, and knowledge sharing
  • Partner closely with technical and non-technical stakeholders to translate business objectives into agent-powered product experiences
  • Keep Block at the frontier by continuously evaluating emerging AI capabilities and making pragmatic tradeoffs across model performance, latency, cost, and user experience
  • Foster a culture of technical excellence, high-quality delivery, rapid experimentation, and learning within your team and beyond

Requirements

  • 15+ years of experience in software engineering or machine learning
  • Deep, recent professional experience building autonomous agents or agentic workflows in production environments, not just prototypes or demos
  • Fluency in the core primitives of agentic systems: context management, planning, tool use, memory, evaluation, and multi-step reasoning
  • Experience bringing frontier LLM capabilities into production products, with hands-on experience in prompt engineering, retrieval-augmented generation, and model optimization
  • A track record of taking AI-powered products from zero to scale in fast-paced, product-driven environments, with the judgment that comes from operating in production
  • Strong software engineering fundamentals with the ability to write production-quality code and make sound architectural decisions
  • Experience providing technical leadership within teams—you've shaped technical direction, driven execution, and elevated others
  • Product-minded engineering approach—you think in terms of user outcomes, not just model metrics
  • Excellent collaboration and communication skills, with ability to build alignment across engineering, product, and design
  • Comfort navigating extreme ambiguity in a domain that's evolving weekly
  • Alignment with Block's mission of economic empowerment and using technology to create access and opportunity

Nice-to-Haves

  • Experience at leading AI organizations with a track record of translating research into production agent systems
  • Background building or scaling agentic products at startups (including early-stage or pivoting companies)
  • Experience with model fine-tuning, distillation, or RLHF to improve agent performance
  • Familiarity with agent evaluation, safety, and alignment challenges in production contexts

Skills

Python, LLMs, Prompt Engineering, RAG, Fine-tuning, RLHF, Agent Orchestration, Multi-agent Systems, Model Optimization, Production AI Systems