Data Engineer

United States | Data Engineering | Remote
Summary

Build and scale data pipelines and infrastructure using Python, Airflow, dbt, and Snowflake to support better patient outcomes at a high-growth healthcare startup. Requires strong Python and SQL expertise, orchestration mastery, a startup mindset, and care in handling sensitive medical data.

About the role

What You’ll Do

  • Architect Robust Pipelines: Design, build, and optimize scalable data pipelines using Airflow, Python, dbt, and Snowflake. You will replace brittle manual processes with resilient, automated workflows.
  • Build Infrastructure as Code: Manage and evolve our cloud infrastructure (AWS/GCP) using Terraform, ensuring our platform is reproducible, secure, and scalable.
  • Elevate Code Quality: Write clean, production-grade code for complex data processing. You will champion engineering best practices, including code reviews, testing, and CI/CD.
  • Optimize Data Models: Collaborate with analysts to design performant SQL transformations and data models in Snowflake (experience with dbt is a huge plus).
  • Ensure Data Reliability: Implement observability and monitoring to catch issues before they impact stakeholders. You are the first line of defense for data quality.
  • Partner Cross-Functionally: Work closely with Data Analysts and Product Managers to understand their data needs and deliver high-quality data products that empower decision-making.

What You Bring to the Table

  • Strong Python Proficiency: You are comfortable writing modular, testable, and efficient Python code for data processing and automation.
  • Advanced SQL & Snowflake: You have deep expertise in SQL and cloud data warehousing (Snowflake preferred), understanding how to optimize queries for performance and cost.
  • Orchestration Mastery: Proven experience building and maintaining complex workflows using Airflow (or similar tools).
  • Infrastructure Mindset: Familiarity with Terraform and cloud services (AWS or GCP). You understand how to provision and manage the resources your pipelines run on.
  • Security & Stewardship: You understand the gravity of handling sensitive medical data. You are experienced in properly handling PHI and PII, implementing secure access controls (RBAC), and adhering to strict governance standards.
  • Startup DNA: You are a self-starter who is comfortable with ambiguity. You take ownership of problems and are willing to wear many hats to get the job done.
  • Communication Skills: You can translate complex technical challenges into clear options for non-technical stakeholders.

Bonus Points

  • dbt Expertise: Experience using dbt to manage transformations and implement testing/documentation standards.
  • Healthcare Background: Experience working with healthcare data standards or in strictly regulated environments (e.g., HIPAA).
  • Containerization: Experience with Docker and Kubernetes for deploying data applications.

Our Tech Stack

  • Compute & Storage: Snowflake, Postgres
  • Orchestration: Airflow
  • Infrastructure: Terraform, AWS/GCP
  • Transformation: dbt, SQL, Python

Skills

Python, SQL, Snowflake, Airflow, dbt, Terraform, AWS, Google Cloud, Postgres, Kubernetes