Storage Systems Administrator II

129k – 151k | San Francisco, CA | Onsite | 2+ YOE | Apr 6

Summary

Manages daily operations, health monitoring, maintenance, and troubleshooting of VAST Data and Pure Storage all-flash systems to support high-performance AI workloads. Requires 2–6 years of storage or systems administration experience, Linux proficiency, scripting ability, and familiarity with high-performance storage protocols.

About the role

What You’ll Be Working On

Storage Operations

Manage the daily administration of VAST Data and Pure Storage environments, including volume provisioning, export management, and quota adjustments.

Health & Monitoring

Use tools like Grafana and Prometheus to monitor cluster health, tracking IOPS and latency to identify potential bottlenecks before they impact users.

Maintenance & Upgrades

Assist in executing non-disruptive software upgrades (VAST OS, Purity//FB) and hardware expansions to keep our infrastructure modern and secure.

Data Integrity

Implement and verify snapshot schedules and replication policies to ensure data durability and successful recovery points.

Troubleshooting

Resolve storage-related tickets and performance issues, collaborating with senior engineers and vendor support (VAST/Pure) to minimize downtime.

Task Automation

Write and maintain scripts (Python/Bash) to automate routine administrative tasks, such as reporting on capacity and streamlining user access.

What You’ll Bring to the Team

Technical Experience: 2–6 years of experience in Storage or Systems Administration, with a solid foundation in managing enterprise-grade storage arrays.

Hands-on Flash Experience: Direct experience with VAST Data or Pure Storage (FlashBlade/FlashArray) is highly preferred.

Linux Fundamentals: Strong proficiency with the Linux CLI, including a clear understanding of mounting file systems and basic network configuration.

Protocol Knowledge: Familiarity with high-performance protocols such as NFS (including NFS over RDMA), SMB, or NVMe-oF.

Scripting Ability: Ability to use Python or Bash to interact with APIs or automate repetitive system tasks.

Execution & Care: A detail-oriented approach to documentation and change management, ensuring petabyte-scale environments remain stable.

Bonus Points:

Experience using Pure1 or VAST VMS/Insight for monitoring and capacity planning.
Basic understanding of InfiniBand and RoCE networking.
Experience in a data center environment or with high-performance computing (HPC) workloads.

Benefits

Competitive compensation
Restricted Stock Units
Paid time off & paid holidays
Comprehensive health, dental & vision insurance
Employer contributions to HSA
Paid parental leave
Paid life insurance, short-term and long-term disability
Professional development & tuition reimbursement
Mental health & wellness support
Commuter benefits (parking & transit)
Cell phone stipend
401(k) Retirement plan with company match up to 4% of salary

Compensation Range: $128,500 - $151,000 + Bonus. Restricted Stock Units are included in all offers.

Skills

VAST Data, Pure Storage, FlashBlade, FlashArray, Linux, NFS, NFS over RDMA, NVMe-oF, Python, Bash, Grafana, Prometheus, InfiniBand, RoCE
