AI Engineer (All Levels)
Designs and builds AI agents, workflows, and evaluation systems for enterprise audit platforms. Requires experience shipping production software with TypeScript, React, Python, Postgres, LLMs, RAG, and agent orchestration.
What You'll Own
Building and shipping AI agents
- Design, build, and ship agentic systems that automate and augment complex audit workflows
- Translate customer problems into concrete agent behaviors and workflows
- Integrate and orchestrate LLMs, tools, retrieval systems, and logic into cohesive, reliable agent experiences
- Own agents in production, including performance and observability
AI-native engineering execution
- Use AI as core leverage in how you design, build, test, and iterate
- Prototype quickly to resolve uncertainty, then harden systems for enterprise-grade reliability
- Build evaluations, feedback mechanisms, and guardrails so agents improve over time
- Design prompts, retrieval pipelines, and agent orchestration systems that perform reliably at scale
Product judgment and customer impact
- Make tradeoffs about what to build, what to cut, and what not to build at all
- Partner closely with Product and Design to define agent capabilities that drive real customer outcomes
- Stay deeply connected to how customers actually use agents and optimize for the highest-impact problems
- Identify the highest-leverage capability gaps and close them without waiting for direction
Ownership of large product areas
- Take full ownership of large product areas rather than executing on narrow tasks
- Identify bottlenecks and unblock progress without waiting for direction
- Increase team velocity by building reusable abstractions, tools, and patterns for agent development
Who You Are
A strong software engineer with AI-native instincts, a bias toward building, strong product judgment, high learning velocity, grounded optimism, and end-to-end thinking.
Experience
- Multiple years of experience shipping production software in complex, real-world systems
- Experience with TypeScript, React, Python, and Postgres
- Built and deployed LLM-powered features serving production traffic
- Designed retrieval pipelines and agent orchestration systems
- Implemented evaluation frameworks for model outputs and agent behaviors
- Worked with vector databases, embedding models, and RAG architectures
- Hands-on experience with modern LLM APIs (OpenAI, Gemini, Anthropic, etc.) and agent frameworks
- Comfort operating in ambiguity and taking responsibility for outcomes
- Deep empathy for the users of professional-grade, mission-critical software
Benefits
- Competitive compensation packages with equity ownership
- Comprehensive health and wellness benefits
- Flexible time off and work schedules
- Technology reimbursements
- 401(k) plan
- Twice-yearly in-person offsites across the U.S.
Senior Machine Learning Operations Engineer
Build and operate Mercury's real-time ML inference platform for fraud risk decisioning. Own model deployment, observability, and lifecycle tooling with strong backend Python fundamentals.
AI Engineer, Evaluation
Design and implement evaluation frameworks and pipelines for AI systems using Evaluation-Driven Development. Build Python-based test suites, LLM graders, and measurement systems that guide prompt iteration and production deployment decisions.