
Senior Software Engineer, Data Platform

United States | Remote | 5+ YOE
Summary

Design, build, and scale the company's data infrastructure on GCP using Spark, Databricks, and Fivetran. Own ETL/ELT pipelines, CDC, data governance, and streaming/batch data flows serving analytics, product, and compliance use cases.

About the role

Key Responsibilities

  • Contribute to the design, development, and scaling of core data infrastructure using GCP, Spark, Databricks, and Fivetran
  • Develop robust and maintainable ETL/ELT workflows that support diverse structured and unstructured data needs
  • Implement and manage Change Data Capture (CDC) pipelines to enable near real-time data replication and synchronization
  • Define and enforce data governance and compliance standards, including access control, auditability, lineage, and metadata management
  • Build and manage streaming and batch data pipelines to serve high-impact use cases across analytics, product, compliance, and experimentation
  • Act as a strategic partner to cross-functional teams (product, analytics, engineering, clinical) to ensure data is accessible, trustworthy, and impactful
  • Drive the long-term architectural vision of the data platform to support current and future business and product needs

Requirements

  • 5+ years of experience in software engineering, with a focus on scalable data architectures
  • Strong expertise in GCP (IAM, GCS, Pub/Sub, etc.) and hands-on experience with Spark and Databricks
  • Hands-on experience with CDC tools such as Fivetran or an equivalent
  • Proficiency in ETL/ELT tools and frameworks (dbt, Apache Airflow, Dataform, etc.)
  • Deep understanding of data governance principles, including compliance and security best practices
  • Demonstrated success in collaborating across functions to deliver data solutions for analytics, experimentation, or compliance
  • Balance of IC execution and leadership skills; comfortable rolling up sleeves or mentoring others
  • Familiarity with streaming data architecture, real-time ingestion, and delivery frameworks
  • Proficient in SQL and Python for data processing and automation
  • Strong problem-solving skills with the ability to work in a fast-paced environment
  • Excellent communication and technical storytelling skills

Nice-to-Haves

  • Experience with Terraform or Infrastructure-as-Code (IaC) for data infrastructure automation
  • Background in HIPAA or other regulated environments with sensitivity to data privacy and compliance
  • Familiarity with the dbt Semantic Layer and modern data modeling best practices
  • Exposure to data observability platforms and practices
  • Familiarity with machine learning data pipelines
  • Exposure to multi-cloud or hybrid-cloud environments
  • Experience building scalable solutions in a 0-1 environment
Skills

GCP, Google Cloud, Spark, Databricks, Fivetran, Change Data Capture, CDC, ETL, ELT, dbt, Apache Airflow, Dataform, SQL, Python, Terraform