Principal GenAI Platform Engineer (US)
Build and maintain GenAI platform infrastructure including model gateways, vector DBs, observability, and secure access controls for production LLM workloads.
Key Responsibilities
- Design, build, and maintain the core infrastructure layer supporting GenAI products, including model gateways, prompt/versioning stores, vector databases, and LLM evaluation tools.
- Implement secure access controls and authentication mechanisms integrated by default into the AI platform components.
- Develop and manage observability, monitoring, and logging solutions for GenAI workloads and infrastructure.
- Collaborate closely with product and engineering teams to integrate GenAI infrastructure with agent frameworks, and downstream applications.
- Optimize infrastructure for scalability, high availability, cost efficiency for production workloads.
Qualifications & Skills
- Extensive experience building and maintaining AI platform infrastructure, Kubernetes, and container security.
- Demonstrated expertise in observability, and monitoring frameworks, with a focus on real-time performance (e.g., experience with OpenTelemetry, MLFlow).
- Experience with AI infrastructure components such as vector databases, prompt/versioning stores, and AI IDEs.
Preferred Experience
- Familiarity with vLLM, SGLang or similar framework to host LLM inference workloads.
- Experience with CI/CD pipelines and automation for AI model deployment and platform operations.
- Strong knowledge of authentication and authorization frameworks integrated into AI platforms.
Senior Infrastructure Engineer
Build analytics infrastructure, observability tooling, and developer platforms to support real-time AI agents for 911 centers. Requires 4+ years infrastructure/platform/backend experience and comfort across the full stack.
Lead Site Reliability Engineer
Lead SRE driving reliability strategy, infrastructure architecture, observability, and incident response for a B2B fintech platform on AWS and Kubernetes. Requires 7+ years building production-grade distributed systems.
Senior Developer Experience Engineer
Senior Platform Engineer focused on Developer Experience building tools, automation, CI/CD systems, and AI tooling to improve developer productivity and workflows. Requires 7+ years cloud experience, containerization, and proficiency in Ruby, Go, or Python.