Customer Reliability Engineer - Infrastructure

125k – 130k | San Francisco, CA | Austin, TX | Philadelphia, PA | Pittsburgh, PA | Support Engineering | Remote | 5+ YOE
Summary

Infrastructure-focused Customer Reliability Engineer supporting Astronomer's managed Airflow platform. Troubleshoots customer cloud/K8s environments, owns monitoring/alerting, participates in on-call, and drives reliability improvements across AWS, GCP, and Azure.

About the role

What you get to do

  • Provide solutions to customers to make them successful using our products
  • Troubleshoot customer environments and engage in active triaging with customers
  • Participate in the on-call rotation for weekend coverage
  • Provide feedback to the product development teams on customer needs and pain points
  • Build out our monitoring and alerting systems
  • Build and maintain automation to ensure daily operational tasks are handled as efficiently as possible
  • Help direct the architecture of the products and contribute where possible
  • Own the customer experience, working directly with customers to prioritize and solve issues, meet SLAs, and provide "white glove" guidance on the path to production
  • Participate remotely within a fully distributed team
  • Enhance and enrich customer documentation
  • Work with the latest technology and multi-cloud implementations

What you bring to the role

  • 5 years of experience, preferably with large, complex cloud infrastructures operating at scale
  • 3 years of experience with Kubernetes
  • Experience managing a production distributed system with at least one major cloud provider (AWS, GCP, Azure)
  • Strong Linux experience
  • Knowledge of how to operate distributed systems and monitor them for issues
  • Previous experience handling customer issues (internal or external)
  • Strong communication skills
  • DevOps or CI/CD experience
  • Python scripting
  • Good troubleshooting skills

Bonus points if you have

  • Experience as a Site Reliability Engineer
  • Experience working with Kubernetes Custom Resources
  • Depth of knowledge with Azure
  • Airflow/Big Data Orchestration experience
  • IaC experience

Compensation

  • Estimated total compensation: $125,000 - $130,000 based on leveling and geography, along with an equity component and a comprehensive benefits package
Skills
Kubernetes, AWS, GCP, Azure, Linux, Python, DevOps, CI/CD, Site Reliability Engineering, Infrastructure as Code