Senior Software Engineer - Domestic Wires

Senior engineer building and scaling Command, Mercury's LLM-powered financial assistant. Owns full-stack AI product development, from system prompts and agentic workflows to eval infrastructure and production reliability.

201k – 251k | San Francisco, CA; New York, NY; Portland, OR | ML Engineering | Hybrid | 7+ YOE

About the role

What you'll do

Ship new capabilities users love:

  • Design and ship new Command skills, the domain-specific instruction sets that teach the model how to handle workflows like sending money, managing invoices, and understanding cash flow
  • Design and build agentic workflows in Command, defining the architecture for how multi-step agent interactions should work as we extend what the product can do on a customer's behalf
  • Work with backend teams to define tool schemas for new capabilities, shaping the data contracts between Mercury's business logic and the model
  • Own new capabilities end to end, from the system prompt to the frontend component that renders the response

Own the LLM layer:

  • Maintain and evolve Command's prompt architecture: the system prompt, skill loading system, session context, and the policy and compliance layers underneath
  • Tune model behavior: reasoning effort, prompt caching strategy, fallback chains, and the streaming patterns that make the product feel fast
  • Stay current with how models are evolving and bring that knowledge back to how Command is built

Build quality in:

  • Write and expand Command's eval harness, adding cases that cover new capabilities and scoring rubrics that detect regressions before users do
  • Partner with product and compliance teams to define what "working correctly" means for each new capability, then build the tests that prove it
  • Own the reliability and quality of what you ship, from initial design through post-launch monitoring

The ideal candidate

  • Has 7 or more years of software engineering experience, with deep technical expertise building and scaling LLM-powered applications in production
  • Has gone beyond shipping a first version: you have scaled an LLM-powered product, dealt with the reliability and performance problems that come with real usage, and made it better over time
  • Has experience designing agentic systems and has opinions about how to architect multi-step workflows that are reliable, explainable, and safe to run on behalf of real users
  • Has built eval infrastructure and can write cases that actually measure whether the product works, not just whether the model outputs something plausible
  • Understands the real tradeoffs in LLM deployments: latency, cost, compliance, and what breaks in production that doesn't show up in demos
  • Has opinions about what makes an AI product trustworthy, not just impressive, and can build toward that bar
  • Is comfortable with TypeScript and willing to learn Haskell for backend tool work, or already comfortable with both
  • Can work across the full stack of an AI product, from the system prompt to the streaming frontend
  • Has a track record of mentoring engineers and raising the technical bar of their team

Skills

TypeScript, Haskell, LLMs, Prompt Engineering, Agentic Workflows, Eval Infrastructure, System Prompts, Streaming Frontend, Tool Schemas, Model Tuning
