Software Engineer

140k – 200k | United States | ML Engineering | Remote | 2+ YOE
Summary

Builds and debugs AI agent infrastructure for healthcare automation, including prompt engineering, runtime issue tracing, evaluation datasets, simulation tooling, data pipelines, and observability dashboards. Requires 2-7 years of experience with production LLMs or AI agents, plus TypeScript proficiency.

About the role

What You’ll Do

  • Work across our AI agent platform — writing prompts, debugging runtime issues, building agent simulation tooling, creating evals, interfacing with client data, and helping monitor system behavior at scale.
  • Trace and fix runtime bugs, then write regression tests.
  • Design evaluation datasets to simulate realistic workflows or red-team our system.
  • Build internal tooling for QA and agent simulation.
  • Normalize and transform messy client data for system integration.
  • Set up automatic testing and latency tracking infrastructure.
  • Create dashboards and observability tooling for agentic system behavior.
  • Expand on our existing eval & testing framework and agent simulation infrastructure.

Skills Required

Technical Skills

  • Proficiency in TypeScript
  • Strong generalist software engineering
  • Strong debugging skills (trace runtime failures, dig through logs, pinpoint issues in async or multi-step agent systems)
  • Data transformation and ingestion (build pipelines to normalize and convert unstructured data for AI systems)
  • Strong understanding of system design, including distributed systems and reliability/performance tradeoffs
  • Experience using modern AI coding tools (e.g. Cursor, GitHub Copilot, Claude)
  • Excellent documentation and testing discipline
  • Proficiency with Git

Soft Skills

  • Care about improving agent behavior
  • High agency; thrive with minimal structure
  • Comfortable getting into the weeds: details, edge cases, editing prompts, and writing evals
  • Comfortable with ambiguity; work well with loose specs spanning prompts, code, and RLHF
  • Learn fast and move fast; pattern-match from past systems work to LLM edge cases

Experience & Who Should Apply

  • 2-7 years of experience working closely with LLMs or AI agents in production systems
  • Created internal tools or frameworks for QA, evals, or agent simulation
  • Contributed to fast-paced product cycles involving AI behavior, latency, and user experience

Nice to Have

  • Experience with multi-agent systems, TTS/NLP pipelines, or structured output validation
  • Familiarity with testing frameworks, LangChain-style agent orchestration, or in-house eval harnesses
  • Experience with prompt engineering, LLM evals, and agent orchestration
Skills

TypeScript, LLMs, AI agents, Git, Cursor, GitHub Copilot, Claude, LangChain, prompt engineering, LLM evals, agent orchestration, distributed systems