Applied AI Engineer, Codex Core Agent
Develops and improves Codex AI agents for real-world software engineering tasks, focusing on performance, reliability, and integration with research and product teams. Requires strong Python, ML/LLM experience, and skills in evaluation, prompting, and debugging production failures.
What You’ll Do
- Design and iterate on agent behaviors across real-world coding tasks and long-horizon workflows.
- Work closely with research to develop and run evals to measure agent performance, regressions, failure modes, and edge cases.
- Improve performance through prompting, tool-use strategies, context construction, and model-facing experimentation.
- Analyze failures in production and systematically improve robustness and reliability.
- Build feedback loops and data systems that bring higher-quality real-task data into evaluation and research.
- Work with product teams to shape user-facing agent experiences and the interfaces the agent depends on.
- Help define what “good” looks like for agents completing complex tasks end-to-end.
You Might Be a Good Fit If You
- Have experience building or shipping machine learning or LLM-powered products.
- Are strong in Python and comfortable with modern ML tooling.
- Have worked on model evaluation, fine-tuning, or prompt design.
- Think in terms of systems and user outcomes, not just model metrics.
- Enjoy debugging messy, real-world failures and turning them into improvements.
- Want to work in the layer that turns research and model potential into systems that actually work for users.
Bonus Points
- Experience with agent frameworks or tool-using LLM systems.
- Research experience with code generation models or developer tooling.
- Experience working with large, messy datasets or production logs.
Staff Software Engineer, AI Runtime
Staff Software Engineer building and scaling Databricks' managed large-scale GPU training platform (AIR). Focus on distributed training performance, scheduling, fault tolerance, and developer experience for thousands of accelerators.
Senior Staff Machine Learning Engineer, Communication & Connectivity
Lead ML architecture and implementation for Airbnb's Messaging & Notifications, building recommendation engines, ranking systems, and LLM-powered experiences while mentoring engineers.
Staff Software Engineer
Founding Staff Applied Agent Engineer to architect and lead Traba's agentic platform, building production LLM/agent systems that integrate with customer WMS/TMS/ERP and drive industrial operations. Requires 7+ years engineering experience with 2+ years building production agent systems.
Senior Software Engineer
Founding Senior Applied Agent Engineer building production LLM agent systems that automate supply chain workflows. Requires 5+ years engineering experience with 1+ year shipping LLM/agent features, strong Python/TypeScript skills, and hands-on agent stack experience.
Staff Software Engineer, Cribl AI
Staff-level AI/ML engineer building and productionizing generative AI features across backend and frontend for Cribl's observability platform. Requires 6+ years experience, AI/ML and MLOps background, and TypeScript/JavaScript proficiency.