Data Engineer
United States | Data Engineering | Remote
Summary
Build and scale data pipelines and infrastructure using Python, Airflow, dbt, and Snowflake to support better patient outcomes at a high-growth healthcare startup. Requires strong Python and SQL expertise, orchestration mastery, a startup mindset, and care in handling sensitive medical data.
About the role
What You’ll Do
- Architect Robust Pipelines: Design, build, and optimize scalable data pipelines using Airflow, Python, dbt, and Snowflake. You will replace brittle manual processes with resilient, automated workflows.
- Build Infrastructure as Code: Manage and evolve our cloud infrastructure (AWS/GCP) using Terraform, ensuring our platform is reproducible, secure, and scalable.
- Elevate Code Quality: Write clean, production-grade code for complex data processing. You will champion engineering best practices, including code reviews, testing, and CI/CD.
- Optimize Data Models: Collaborate with analysts to design performant SQL transformations and data models in Snowflake (experience with dbt is a huge plus).
- Ensure Data Reliability: Implement observability and monitoring to catch issues before they impact stakeholders. You are the first line of defense for data quality.
- Partner Cross-Functionally: Work closely with Data Analysts and Product Managers to understand their data needs and deliver high-quality data products that empower decision-making.
What You Bring to the Table
- Strong Python Proficiency: You are comfortable writing modular, testable, and efficient Python code for data processing and automation.
- Advanced SQL & Snowflake: You have deep expertise in SQL and cloud data warehousing (Snowflake preferred), and you know how to optimize queries for performance and cost.
- Orchestration Mastery: Proven experience building and maintaining complex workflows using Airflow (or similar tools).
- Infrastructure Mindset: Familiarity with Terraform and cloud services (AWS or GCP). You understand how to provision and manage the resources your pipelines run on.
- Security & Stewardship: You understand the gravity of handling sensitive medical data. You are experienced in properly handling PHI and PII, implementing secure access controls (RBAC), and adhering to strict governance standards.
- Startup DNA: You are a self-starter who is comfortable with ambiguity. You take ownership of problems and are willing to wear many hats to get the job done.
- Communication Skills: You can translate complex technical challenges into clear options for non-technical stakeholders.
Bonus Points
- dbt Expertise: Experience using dbt to manage transformations and implement testing/documentation standards.
- Healthcare Background: Experience working with healthcare data standards or in strictly regulated environments (e.g., HIPAA).
- Containerization: Experience with Docker and Kubernetes for deploying data applications.
Our Tech Stack
- Compute & Storage: Snowflake, Postgres
- Orchestration: Airflow
- Infrastructure: Terraform, AWS/GCP
- Transformation: dbt, SQL, Python
Skills
Python, SQL, Snowflake, Airflow, dbt, Terraform, AWS, Google Cloud, Postgres, Kubernetes