Member of Technical Staff, Infrastructure
Infrastructure engineer scaling multi-cluster, multi-cloud systems from millions to hundreds of millions of concurrent voice calls, owning services such as Anycast routers and GPU clusters. Requires experience scaling massive, resilient systems at Series B and later stages.
Responsibilities
- Ramp up on multi-cluster, multi-cloud infrastructure (30 days).
- Deliver new services like Anycast Global Router (60 days).
- Own domains like GPU inference clusters (90 days).
Requirements
- Experience scaling systems from Series B through Series F rounds.
- Proven track record of scaling massive, resilient, and performant systems.
Nice-to-Haves
- Built your own startup.
Compensation & Benefits
- Competitive salary.
- Excellent equity ownership.
- Comprehensive health coverage (medical, dental, vision).
- Flexible time off.
- Catered meals, transportation, gym, and a $10k annual learning and development (L&D) budget.
- Quarterly off-sites.
Senior Network & Site Reliability Engineer
Design, operate, and automate the global network and reliability layer for a high-performance NVIDIA DGX SuperPOD supporting ML workloads. Own architecture, observability, incident response, and security for mission-critical infrastructure.
Senior Software Engineer - Observability Visibility
Senior engineer building observability and resilience standards, tooling, and automation to make reliability the default across Datadog services. Requires 5+ years of experience, Go/Python skills, and experience delivering AI features.
Senior Manager, DevOps Engineering
Lead and mentor a team of DevOps and Infrastructure Engineers responsible for build pipelines, CI/CD systems, developer tooling, and release infrastructure across Hivemind Solutions. Drive modernization of C++/Python build ecosystems and ensure scalable, secure software delivery pipelines.
Staff Software Engineer
Staff Software Engineer owning technical strategy and systems for Coinbase's test infrastructure at scale. Focus on fast, reliable test signals through orchestration, smart selection, sharding, and flakiness remediation.
Staff Engineer, AI Productivity
Staff-level engineer building infrastructure, tooling, and documentation to make AI coding agents dramatically more productive across the codebase. Owns agentic dev environments, MCP integrations, and agent context.