Software Engineer

140k – 200k | United States | ML Engineering | Remote | 2+ YOE | Jul 21

Summary

Builds and debugs AI agent infrastructure for healthcare automation, including prompt engineering, runtime issue tracing, evaluation datasets, simulation tooling, data pipelines, and observability dashboards. Requires 2-7 years of experience with production LLMs or AI agents, plus TypeScript proficiency.

About the role

What You’ll Do

Work across our AI agent platform: writing prompts, debugging runtime issues, building agent simulation tooling, creating evals, interfacing with client data, and helping monitor system behavior at scale.
Trace and fix runtime bugs, then write regression tests.
Design evaluation datasets to simulate realistic workflows or red-team our system.
Build internal tooling for QA and agent simulation.
Normalize and transform messy client data for system integration.
Set up automatic testing and latency tracking infrastructure.
Create dashboards and observability tooling for agentic system behavior.
Expand on our existing eval & testing framework and agent simulation infrastructure.

Skills Required

Technical Skills

Proficiency in TypeScript
Strong generalist software engineering
Strong debugging skills (trace runtime failures, dig through logs, pinpoint issues in async or multi-step agent systems)
Data transformation and ingestion (build pipelines to normalize and convert unstructured data for AI systems)
Strong understanding of system design, including distributed systems and reliability/performance tradeoffs
Experience using modern AI coding tools (e.g. Cursor, GitHub Copilot, Claude)
Excellent documentation and testing discipline
Proficiency with Git

Soft Skills

Care about improving agent behavior
High agency; thrive with minimal structure
Comfortable getting in the weeds with details, edge cases, editing prompts, writing evals
Comfortable with ambiguity; work well with loose specs spanning prompts, code, and RLHF
Learn fast and move fast; pattern-match from past systems work to LLM edge cases

Experience & Who Should Apply

2-7 years of experience working closely with LLMs or AI agents in production systems
Created internal tools or frameworks for QA, evals, or agent simulation
Contributed to fast-paced product cycles involving AI behavior, latency, and user experience

Nice to Have

Experience with multi-agent systems, TTS/NLP pipelines, or structured output validation
Familiarity with testing frameworks, LangChain-style agent orchestration, or in-house eval harnesses
Experience with prompt engineering, LLM evals, and agent orchestration

Skills

TypeScript, LLMs, AI agents, Git, Cursor, GitHub Copilot, Claude, LangChain, prompt engineering, LLM evals, agent orchestration, distributed systems
