Staff Operations Engineer

United States | DevOps / SRE | Remote | 6+ YOE

Summary

Lead design, reliability, and evolution of hybrid-cloud and workplace infrastructure. Own architecture, drive complex projects, mentor engineers, and ensure scalable, secure systems across teams.

About the role

Domain Architecture & Technical Direction

  • Own and evolve architecture within a defined infrastructure domain
  • Design and implement scalable, reliable systems spanning multiple teams or environments
  • Establish and promote best practices, patterns, and standards within the domain
  • Contribute to medium- and long-term technical strategy (typically 6-18 months)

Complex Problem Solving & Execution

  • Lead delivery of ambiguous, high-impact infrastructure projects
  • Break down complex system problems into implementable solutions
  • Drive migrations, re-architectures, and performance/reliability improvements
  • Remain hands-on with critical systems and implementations

Cross-Team Collaboration

  • Work across teams (IT, SRE, Security, Service Owners) to unify solutions
  • Influence technical decisions through design reviews and collaboration
  • Ensure systems integrate cleanly across environments (office, data center, cloud)

Reliability, Operations & Scaling

  • Improve system reliability through monitoring, alerting, and operational design
  • Contribute to defining SLIs/SLOs and capacity planning within the domain
  • Participate in and lead root cause analysis for complex incidents
  • Reduce operational toil through automation and system improvements

Infrastructure & Networking Depth

  • Design and support core infrastructure components (compute, DNS, networking, identity, etc.)
  • Drive improvements in performance, scalability, and reliability
  • Contribute deep expertise in at least one area (e.g., DNS, network architecture, cloud infra)

Automation & Tooling

  • Build and improve automation using scripting and Infrastructure as Code
  • Contribute to internal tooling and platform improvements
  • Promote repeatable, standardized approaches to system management

Mentorship & Technical Guidance

  • Mentor engineers and guide system design and troubleshooting
  • Raise the technical quality of the team through reviews and shared practices
  • Act as a go-to resource within the domain

Documentation & Operational Clarity

  • Maintain clear documentation, diagrams, and runbooks for systems owned
  • Ensure systems are understandable and operable by others
  • Contribute to knowledge sharing across teams

What you’ll bring

  • 6+ years of experience in systems engineering or infrastructure roles
  • Strong experience designing and operating production infrastructure
  • Solid expertise in VMware, Cisco UCS, application and network load balancers, Linux/Unix operating systems, networking fundamentals (DNS, TCP/IP, routing, firewalls), and data center environments
  • Demonstrated ability to lead complex technical work across teams

Preferred Skills

  • Experience with Infrastructure as Code (Puppet, Ansible, etc.)
  • Python
  • Familiarity with observability tooling and reliability practices
  • Experience with containerization and modern platform tooling
  • Exposure to security best practices in infrastructure design

Bonus Skills/Competencies

  • GitHub Enterprise Administration
  • User Access Management
  • GitHub Organization / Repository Management
  • SSO / SCIM Integration
  • Policy Enforcement

Skills

VMware, Cisco UCS, Load Balancers, Linux, Unix, DNS, TCP/IP, Networking, Firewalls, Infrastructure as Code, Puppet, Ansible, Python, Observability, Containerization