What You’ll Be Working On
Customer Support
- Provide exceptional technical support to customers via Zendesk, meeting SLAs and maintaining high CSAT (95%+)
On-Call Rotation
- Participate in a 24/7 on-call rotation to ensure timely resolution of critical issues
Troubleshooting
- Diagnose and resolve issues related to VMs, hardware failures, and scaling tests using CLI and internal tools
Alert Triage and Maintenance
- Manage alert triage, prepare for maintenance windows, and conduct node delivery testing
Collaboration
- Work closely with SRE, Networking, and Storage teams from initial triage to root cause analysis (RCA) delivery
Global Teamwork
- Follow global team collaboration and handoff processes for ticketing and on-call coverage
Knowledge Sharing
- Develop onboarding/training materials, knowledge base documentation, and standard operating procedures (SOPs)
What You’ll Bring to the Team
Education/Experience
- Bachelor's degree in IT, Computer Science, Engineering, or a related field, or 4+ years of equivalent technical experience
Linux Proficiency
- Strong command-line interface (CLI) skills in Linux environments
Version Control
- Proficiency with Git for code management and collaboration
Customer Support Experience
- 5+ years of experience in a customer support role, ideally within cloud, storage, or networking environments
Cloud Technologies
- Experience with container orchestration (e.g., Kubernetes), workload management (e.g., Slurm), infrastructure as code (e.g., Terraform), and monitoring tools (e.g., Grafana)
Public Cloud Knowledge
- Familiarity with other public cloud platforms (e.g., AWS, Azure, GCP)
Communication Skills
- Excellent communication and customer service skills, including the ability to prioritize competing escalations
HPC Knowledge
- Understanding of HPC technologies such as InfiniBand, RDMA, RoCE, and Software-Defined Networking (SDN)
Bonus Points
Certifications
- Kubernetes: CKA, CKAD, CKS, KCNA
- AWS: Machine Learning - Specialty, Data Analytics - Specialty, Solutions Architect - Professional, Developer - Associate
- NVIDIA: AI Infrastructure and Operations, Generative AI and LLMs, Generative AI Multi-modal, InfiniBand
- Linux Foundation: IT Associate, System Administrator
Cloud Expertise
- Deep understanding of specific cloud platforms and services
Automation Skills
- Experience with automation tools and scripting languages
Problem-Solving Abilities
- Demonstrated ability to analyze complex technical issues and develop effective solutions
Collaboration and Mentorship
- Proven ability to mentor, train, and onboard colleagues
Passion for Sustainability
- A strong interest in contributing to a more sustainable future through technology
Benefits
- Competitive compensation
- Restricted Stock Units
- Paid time off & paid holidays
- Comprehensive health, dental & vision insurance
- Employer contributions to HSA
- Paid parental leave
- Paid life insurance, short-term and long-term disability
- Professional development & tuition reimbursement
- Mental health & wellness support
- Commuter benefits (parking & transit)
- Cell phone stipend
- 401(k) retirement plan with company match up to 4% of salary
- Volunteer time off
Compensation
- $145,000 – $175,000 + Bonus
- Restricted Stock Units included in all offers