Senior Network Engineer

Austin, TX | Atlanta, GA | Denver, CO | Seattle, WA | DevOps / SRE | Hybrid | 5+ YOE
Summary

Senior Network Engineer responsible for operating and architecting Cloudflare's global edge and backbone network, building AI-powered operational tooling, and mentoring engineers. Requires deep expertise in BGP, MPLS, network automation, and production LLM agent development.

About the role

Role Responsibilities

  • Own the technical operation, engineering, and architecture of Cloudflare's global network, including planning, installation, and day-to-day management of hardware and software across the edge and backbone.
  • Serve as a hands-on operational anchor for the team — diagnosing and resolving complex network faults, owning incident response end-to-end including on-call rotation, and contributing to post-incident reviews to drive continuous improvement.
  • Build production AI agents for network operations. Design, ship, and own LLM-powered tools that integrate with our operational systems of record via tool calling. Ship them with evals, observability of agent decisions, cost tracking, and human-in-the-loop checkpoints where autonomous action carries blast radius.
  • Apply AI to accelerate troubleshooting. Use and extend our internal AI platform (Workers AI, AI Gateway) to speed up root cause analysis, pattern recognition across faults, and operational decision-making under pressure.
  • Architect network improvements that lower latency, reduce packet loss, and increase scale, optimising the end-user experience across Cloudflare's global infrastructure.
  • Mentor and develop junior engineers on the team, setting technical standards, conducting structured troubleshooting sessions, and building a culture of operational excellence and continuous learning.
  • Collaborate with internal teams on cross-functional network projects, contributing to design documentation, SOPs, and knowledge base articles that outlast any individual contributor.

Role Requirements (Must-Have Skills)

  • Proven track record in large-scale network engineering and operations.
  • Proficiency in Python, TypeScript, or Go sufficient to build, debug, and maintain agent code, not just to glue scripts together.
  • Working knowledge of agent integration patterns: function/tool calling, MCP or equivalent, retrieval over operational corpora (runbooks, postmortems, change history), and prompt iteration with evals.
  • Experience reasoning about agent failure modes in production: hallucination guards, fallback paths, rollback, blast-radius control.
  • Deep expertise in BGP and anycast routing, with the ability to diagnose and resolve complex routing issues in a production environment.
  • Strong understanding of MPLS and Segment Routing.
  • Proficiency across multiple network vendor operating systems (Juniper, Cisco, Arista, or similar).
  • Experience with network automation frameworks such as SaltStack, Ansible, or equivalent, and a strong instinct for solving operational problems through code.
  • Ability to prioritise effectively and lead calmly when faced with high-pressure scenarios.
  • An effective communicator who can adapt their style to any audience — whether guiding a junior engineer through a fault or presenting a network architecture decision to senior leadership.

Nice-to-Have Skills

  • Shipped LLM-powered tooling into production use by a team other than your own, with measurable operational impact.
  • Professional-level network certification (JNCIP, CCNP, or an equivalent or higher qualification).
  • Experience with optical transport technologies such as CWDM and DWDM.
  • Linux system administration.
  • Experience writing network configuration and design documentation at an architectural level.

Skills

Python, TypeScript, Go, BGP, Anycast, MPLS, Segment Routing, Juniper, Cisco, Arista, SaltStack, Ansible, LLM agents, Tool calling