Cloud Infrastructure Engineer
Designs, deploys, and improves scalable blockchain infrastructure using Kubernetes, Terraform, and cloud tools. Drives AI enablement, builds observability with Prometheus/Grafana, manages multi-cloud networks, and leads incident response. Requires 5+ years in SRE/infrastructure with a strong automation focus.
What You'll Do
- Architect and operate scalable, self-healing infrastructure leveraging Kubernetes, Terraform, and cloud-native tools across multi-region deployments.
- Drive AI enablement across engineering — ensuring repos, tooling, and workflows are optimized for agentic development with tools like Claude Code, Cursor, and Codex.
- Build AI-powered infrastructure tooling and automation (e.g., automated K8s upgrades, IaC plan analysis, cost optimization advisors, MCP servers, n8n workflows).
- Build and maintain internal developer platform (IDP) capabilities for self-service deployments, observability, and reliability.
- Develop observability frameworks using Prometheus and Grafana for metrics, dashboards, and alerting.
- Lead incident management with blameless post-mortems; define and enforce SLIs, SLOs, and error budgets across services.
- Design and manage multi-cloud, multi-region network architecture: VPC design, IPAM, DNS (Cloudflare), cross-cloud connectivity, security groups, and edge-proxy/Istio gateway configuration.
- Collaborate with security teams to embed compliance into infrastructure, including IaC scanning and runtime protection.
- Provide technical leadership and mentorship to elevate the team's operational capabilities.
What We're Looking For
- 5+ years as an Infrastructure Engineer focused on reliability (SRE, Production Engineer, Platform Engineer).
- Experience driving company-wide reliability efforts, including SLO frameworks and error budget policies.
- Strong proficiency with observability stacks: OpenTelemetry, Prometheus/Grafana.
- Deep experience with cloud infrastructure (AWS/GCP), Kubernetes, and multi-region architectures.
- Skilled with Terraform, Helm, and GitOps workflows (e.g., ArgoCD) with an automation-first mindset.
- Experience leveraging agentic development tools (Claude Code, Cursor, Codex) and workflow automation (n8n) to accelerate IaC and build internal tooling is a strong plus.
- Solid networking fundamentals: VPC design, DNS, IPAM, security groups, and cross-cloud connectivity. Service mesh experience (e.g., Istio) is a plus.
- Calm and effective incident responder with a focus on systemic improvement.
- Strong cross-functional communicator across SRE, security, and product engineering.
- Experience with blockchain infrastructure, distributed systems, or high-throughput RPC is a plus but not required.
Benefits and Perks
- Medical, Dental, & Vision
- Gym Reimbursement
- Home Office Build-out Budget
- In-Office Group Meals
- Wellbeing & Mental Health Perks
- Learning & Development Stipend
- Company Sponsored Conferences & Events
- HSA and FSA Plans
- Fertility Benefits
- Competitive compensation including base salary and equity
- 401(k)
- Unlimited flexible time off
Senior Network Engineer
Design, deploy, and operate enterprise network infrastructure for corporate facilities and hybrid cloud environments with zero-trust architecture and compliance requirements. Requires 5+ years of enterprise networking experience and the ability to obtain TS/SCI clearance.
Site Reliability Engineer
Senior or Staff Site Reliability Engineer focused on continuous delivery infrastructure using Argo Workflows, ArgoCD, and Kubernetes. Owns deployment tooling, onboarding flows, and participates in 24/7 on-call. Requires 6+ years building and operating distributed systems.