Senior Software Engineer, AI Platform & Agents
Builds and maintains AI platform infrastructure, SDKs, and services for LLMs and agentic workflows. Collaborates with scientists and engineers on RAG pipelines, production AI systems, and observability; requires 5+ years of experience with Python/Go and cloud-native principles.
Responsibilities
AI Platform Engineering: Build and maintain the core infrastructure, services, and SDKs that empower the rest of the company to leverage LLMs and agentic workflows.
Scientific Collaboration: Partner closely with Applied Scientists and Data Scientists to translate experimental models and research into scalable, production-grade AI services.
Data Strategy: Work with Data Engineers to design robust data pipelines that feed into RAG systems, ensuring high-quality, real-time data retrieval for AI context.
Internal Onboarding: Act as a technical consultant and facilitator for other engineering teams, helping them successfully onboard their specific AI use cases onto the platform.
Design & Document: Design, document, and deploy public AI interfaces that provide an exceptional developer experience for internal users.
Operational Excellence: Collaborate with Site Reliability Engineers (SRE) to deliver AI applications in a repeatable, stable, and highly efficient software development lifecycle.
Quality Engineering: Develop comprehensive test cases for AI models and workflows, working with QE to ensure deterministic and reliable outcomes in a non-deterministic AI landscape.
Skills and Qualifications
Core Development: Expert-level API and service development in Python (required) and/or Go. Experience with SQL and database management.
AI Architecture: Hands-on experience building RAG pipelines, managing vector databases, and orchestrating multi-agent systems.
Cross-Functional Fluency: Proven ability to communicate technical requirements effectively across Data Science, Data Engineering, and Product domains.
Cloud Native Principles: Deep understanding of production-tested principles including horizontal scaling, 12-factor application design, and security practices such as those published by OWASP.
Observability: Experience monitoring AI performance and analyzing model/software errors using tools like Sentry, Datadog, or Jaeger.
DevOps Tools: Proficiency in container-based development with Docker and source control via Git.
Experience: At least 5 years of hands-on software development experience.
Bonus Points
- Experience with AI orchestration frameworks (e.g., LangChain, LangGraph, LlamaIndex, LiteLLM, or CrewAI).
- Production experience with distributed, event-driven, and message-driven architectures.
- Experience building a self-service "Platform-as-a-Product" offering for internal developers.
Compensation
The US base salary for this position ranges from $150,000/year in our lowest geographic market up to $200,000/year in our highest geographic market.
Senior Software Engineer, Compute (Temporal Cloud)
Build and operate distributed systems and multi-tenant platform services for Temporal Cloud. Own SLOs, incident response, and production reliability for APIs and control/data planes.
Senior Software Engineer, Atlas Search Query
Lead complex search query architecture and optimization projects for MongoDB Atlas Search. Requires 5+ years in data management/search systems, distributed systems experience, and proficiency in Java and Rust.