Forward Deployed Engineer, RL Environments
Builds and maintains sandboxed, reproducible RL environments for AI agent training, including terminal, browser, and tool-augmented setups. Requires 2+ years of Python and systems engineering experience, hands-on containerization work, and an understanding of RL concepts.
What You’ll Do
- Design, build, and maintain sandboxed RL environments for agentic AI training—including terminal emulators, browser automation harnesses, computer-use simulators, and tool-augmented workspaces (e.g., environments built on frameworks like TerminalBench, OSWorld, and Tau-bench)
- Develop reproducible, containerized execution environments (Docker, VMs, lightweight sandboxes) that support deterministic task rollouts and reward signal collection
- Integrate with and extend open-source agentic tooling and custom CLI/API harnesses to enable multi-step agent interaction
- Build instrumentation and observability layers—structured logging, trajectory capture, state snapshotting—so training runs and human annotation sessions produce clean, auditable data
- Collaborate with data operations to design task curricula and evaluation protocols that stress-test model capabilities across environment types
- Own environment deployment and reliability: CI/CD pipelines, automated testing of environment configurations, and monitoring for drift or breakage across versions
- Rapidly prototype new environment types as client and internal requirements evolve, moving from spec to working system in days, not weeks
What We’re Looking For
Required
- 2+ years of professional software engineering experience, with strong fundamentals in Python and at least one systems-level language (Go, Rust, or C++)
- Demonstrated experience with containerization and sandboxing (Docker, Podman, Firecracker, or similar) in production or near-production contexts
- Familiarity with RL concepts: MDPs, reward shaping, episode structure, observation/action spaces. You don’t need to have trained models, but you need to understand what an environment must provide to an RL training loop
- Experience building or maintaining developer tooling, CLI tools, or infrastructure automation
- Comfort working with browser automation frameworks or terminal interaction tooling
- Strong debugging instincts—you can trace failures across process boundaries, container layers, and network calls
- Ability to read and implement from academic papers and open-source benchmark repositories without extensive hand-holding
Preferred
- Direct experience building or contributing to RL environments (Gymnasium/Gym, PettingZoo, or custom environment implementations)
- Experience with agentic AI evaluation frameworks (SWE-bench, WebArena, OSWorld, TerminalBench, or similar)
- Familiarity with GCP or AWS infrastructure (Compute Engine, ECS/EKS, Cloud Build)
- Prior work at an AI data company, ML platform company, or AI research lab
- Contributions to open-source projects in the RL, agents, or dev-tools space
Compensation
Annual base salary range: $140,000–$200,000 USD