Senior Storage Systems Engineer

149k – 161k | San Francisco, CA | Onsite | 5+ YOE
Summary

Senior Storage Systems Engineer manages VAST Data and Pure Storage flash arrays for high-performance AI/HPC workloads, handling administration, performance monitoring, non-disruptive upgrades, data protection, Tier 3 support, and automation. Requires 5+ years storage experience, Linux proficiency, and protocol expertise.

About the role

What You'll Be Working On

  • Flash Array Administration: Own the end-to-end management of VAST Data (Universal Storage) and Pure Storage (FlashBlade/FlashArray) environments, including initial setup, volume provisioning, and export management.
  • Performance Monitoring: Proactively monitor VAST and Pure clusters for IOPS, throughput, and latency bottlenecks, ensuring storage performance stays ahead of GPU demand.
  • Non-Disruptive Operations: Execute software upgrades (Purity//FB, VAST OS), expansion of D-Nodes/C-Nodes, and hardware refreshes with zero downtime for our AI customers.
  • Data Protection: Manage snapshots, replication policies, and data reduction (deduplication/compression) strategies to optimize TCO while ensuring 100% data durability.
  • Tier 3 Support: Act as the lead technical point of contact for storage incidents, working directly with VAST and Pure support engineering to resolve complex fabric or metadata issues.
  • Integration & Automation: Use REST APIs and Python to automate provisioning and integrate storage health metrics into our centralized observability stack (Grafana/Prometheus).

What You'll Bring to the Team

Technical Experience: 5–8+ years of experience in Storage Administration, with at least 3 years of hands-on experience managing VAST Data or Pure Storage in a production environment.

Protocol Expertise: Deep understanding of NFS over RDMA, SMB, and NVMe-oF, and how they are implemented within VAST and Pure architectures.

Linux Systems Mastery: Strong command of the Linux CLI, specifically for mounting, tuning, and troubleshooting high-performance file systems.

Network Awareness: Understanding of how storage interacts with InfiniBand and RoCE fabrics to ensure low-latency data delivery to GPU nodes.

Scripting Skills: Proficiency in Python, Bash, or similar for automating volume creation, quota management, and reporting via storage APIs.

Operational Discipline: A meticulous approach to capacity planning and documentation, ensuring the environment remains stable as we add petabytes of scale.

Bonus Points

  • Experience with Pure1 or VAST VMS/Insight for predictive analytics and capacity forecasting.
  • Familiarity with Slurm or Kubernetes (CSI) integration with high-performance storage.
  • Prior experience in a "Large Scale" environment (multi-petabyte footprints).

Benefits

  • Competitive compensation and equity packages
  • Restricted Stock Units
  • Paid time off, paid holidays & leave of absence programs
  • Comprehensive health, dental & vision insurance
  • Employer HSA contributions
  • Paid parental leave
  • Paid life insurance, short-term and long-term disability
  • Professional development & tuition reimbursement
  • Mental health & wellness support
  • Commuter benefits (parking & transit)
  • Cell phone stipend
  • 401(k) Retirement plan with company match up to 4% of salary
  • Volunteer time off
  • Global travel insurance & emergency assistance
  • Daily meals allowance
  • Additional perks & programs specific to location

Compensation Range

Compensation will be paid in the range of $148,500 to $161,000, plus bonus. Restricted Stock Units are included in all offers.

Skills
VAST Data, Pure Storage, FlashBlade, FlashArray, NFS over RDMA, NVMe-oF, Linux, InfiniBand, RoCE, Python, Bash, REST API, Grafana, Prometheus, Kubernetes