Platform Engineer

Builds backend infrastructure and the core platform for an AI agent cloud, including VM hypervisors, LLM sandboxes, networking, and orchestration. Requires 5+ years in distributed systems and Linux administration for an onsite role in San Francisco.

180k – 250k · San Francisco, CA · DevOps / SRE · Onsite · 5+ YOE

About the role

Responsibilities

  • Designing and building the E2B backend and core infrastructure
  • Working with VM hypervisors like Firecracker, gVisor, or Linux systems
  • Building and optimizing runtimes and sandboxes for LLMs
  • Developing networking solutions for secure, isolated environments
  • Monitoring resources and optimizing sandbox performance
  • Solving general infrastructure challenges at scale
  • Working with orchestration technologies like Kubernetes or Nomad
  • Collaborating closely with Distributed Systems Engineers

Requirements

  • 5+ years building infrastructure, especially distributed systems
  • 5+ years of Linux administration, with knowledge of Linux fundamentals: bootloader, kernel, package management, networking, storage, namespaces, containers
  • Experience building and operating infrastructure at scale
  • Excited to work in person from San Francisco on a DevTool product
  • Detail-oriented with great taste in design and engineering
  • Comfortable working closely with users
  • Proactive, not afraid to take ownership of part of the product
  • Excited to take projects from 0 → 1 with the support of the team

Benefits

  • Full healthcare, vision, and dental insurance
  • Unlimited PTO

Skills

Linux, Kubernetes, Firecracker, gVisor, Nomad, Distributed Systems, Namespaces, Containers, Networking, Sandboxes

Similar roles

DevOps / SRE jobs

Network Engineer, Design & Engineering

Design end-to-end datacenter network architectures for AI training and inference workloads. Own topology selection, fabric design, physical infrastructure integration, and produce deployable HLDs/LLDs across multiple GPU platforms and customer requirements.

180k – 300k · New York, NY +4 · DevOps / SRE · On-site · 5+ YOE · BGP · PFC

Developer Productivity Engineer

As a Senior Developer Productivity Engineer, you will own the build, test, and deployment processes for a 50+ person engineering team. You will improve monorepo productivity, drive excellence in testing, and support multi-cloud/multi-region infrastructure to enable fast and safe shipping.

180k – 320k · United States · DevOps / SRE · Remote · 5+ YOE · Go · CI/CD

Data Center Network Engineer

Design and own high-performance data center network infrastructure for GPU clusters, including fabric architecture, cabling, and performance validation. Requires deep experience with InfiniBand, RDMA, or high-performance Ethernet at a senior level.

180k – 360k · San Francisco, CA +1 · DevOps / SRE · Hybrid · 5+ YOE · RDMA · Ethernet

Infrastructure Engineer (Observability)

Builds and operates scalable observability platforms for metrics, logs, and traces across GPU and HPC infrastructure. Designs telemetry pipelines, alerting, and multi-tenant systems using Prometheus, Grafana, and Kafka; requires 5+ years of SRE/infra experience.

180k – 200k · New York, NY +2 · DevOps / SRE · Remote · 5+ YOE · Go · ELK

Infrastructure Engineer (GPU & Compute)

Owns GPU diagnostics, validation workflows, and automation for bare-metal infrastructure supporting AI/ML workloads. Requires 5+ years in systems engineering with strong Linux, Python, and NVIDIA tools expertise.

180k – 200k · New York, NY +2 · DevOps / SRE · Remote · 5+ YOE · PXE · IPMI