Staff Operations Engineer
United States | DevOps / SRE | Remote | 6+ YOE
Summary
Lead design, reliability, and evolution of hybrid-cloud and workplace infrastructure. Own architecture, drive complex projects, mentor engineers, and ensure scalable, secure systems across teams.
About the role
Domain Architecture & Technical Direction
- Own and evolve architecture within a defined infrastructure domain
- Design and implement scalable, reliable systems spanning multiple teams or environments
- Establish and promote best practices, patterns, and standards within the domain
- Contribute to medium- and long-term technical strategy (typically 6-18 months)
Complex Problem Solving & Execution
- Lead delivery of ambiguous, high-impact infrastructure projects
- Break down complex system problems into implementable solutions
- Drive migrations, re-architectures, and performance/reliability improvements
- Remain hands-on with critical systems and implementations
Cross-Team Collaboration
- Work across teams (IT, SRE, Security, Service Owners) to unify solutions
- Influence technical decisions through design reviews and collaboration
- Ensure systems integrate cleanly across environments (office, data center, cloud)
Reliability, Operations & Scaling
- Improve system reliability through monitoring, alerting, and operational design
- Contribute to defining SLIs/SLOs and capacity planning within the domain
- Participate in and lead root cause analysis for complex incidents
- Reduce operational toil through automation and system improvements
Infrastructure & Networking Depth
- Design and support core infrastructure components (compute, DNS, networking, identity, etc.)
- Drive improvements in performance, scalability, and reliability
- Contribute deep expertise in at least one area (e.g., DNS, network architecture, cloud infra)
Automation & Tooling
- Build and improve automation using scripting and Infrastructure as Code
- Contribute to internal tooling and platform improvements
- Promote repeatable, standardized approaches to system management
Mentorship & Technical Guidance
- Mentor engineers and guide system design and troubleshooting
- Raise the technical quality of the team through reviews and shared practices
- Act as a go-to resource within the domain
Documentation & Operational Clarity
- Maintain clear documentation, diagrams, and runbooks for systems owned
- Ensure systems are understandable and operable by others
- Contribute to knowledge sharing across teams
What you’ll bring
- 6+ years of experience in systems engineering or infrastructure roles
- Strong experience designing and operating production infrastructure
- Solid expertise in VMware, Cisco UCS, application and network load balancers, Linux/Unix operating systems, networking fundamentals (DNS, TCP/IP, routing, firewalls), and data center environments
- Demonstrated ability to lead complex technical work across teams
Preferred Skills
- Experience with Infrastructure as Code (Puppet, Ansible, etc.)
- Python
- Familiarity with observability tooling and reliability practices
- Experience with containerization and modern platform tooling
- Exposure to security best practices in infrastructure design
Bonus Skills/Competencies
- GitHub Enterprise Administration
- User Access Management
- GitHub Org / Repo Management
- SSO / SCIM Integration
- Policy Enforcement
Skills
VMware, Cisco UCS, Load Balancers, Linux, Unix, DNS, TCP/IP, Networking, Firewalls, Infrastructure as Code, Puppet, Ansible, Python, Observability, Containerization