Data Engineering & Infrastructure
- Build and maintain scalable ETL pipelines using Python, SQL, and APIs to ingest and process large-scale biometric and sensor data
- Design data models and workflows that support clinical studies, internal tools, and downstream analytics
- Manage data storage, retrieval, and archival systems in AWS, including long-term access and data-restore workflows
- Ensure data integrity, reproducibility, and proper versioning across evolving datasets and analyses
- Use AI-assisted tools to speed up data analysis, debugging, and code development and to reduce manual effort
Clinical Analytics & Algorithm Validation
- Analyze sleep, physiological, and behavioral datasets to evaluate product performance and validate new features
- Perform statistical analyses (e.g., correlation, error metrics, bootstrapping, validation frameworks) to assess algorithm accuracy and clinical outcomes
- Develop evaluation pipelines for metrics such as heart rate (HR) and heart rate variability (HRV) accuracy, presence detection, and sleep staging
- Build tools and structured datasets to support training and validation of machine learning models, integrating multiple data sources for supervised learning
- Investigate edge cases, sensor issues, and data anomalies to improve model robustness
Internal Tooling & Visualization
- Maintain and extend Python-based applications for visualizing and annotating biometric data
- Develop interactive tools for researchers and engineers to inspect sessions, validate signals, and debug algorithms
- Streamline workflows for clinical teams to reduce manual effort and improve reproducibility
Cross-Functional Collaboration & Communication
- Partner with Machine Learning, Hardware, Firmware, and Product teams to build algorithms and test prototypes
- Work with Growth and Product teams to explore user behavior and inform feature development
- Synthesize findings into reports, dashboards, and presentations for internal teams and external audiences
- Contribute to abstracts, posters, and conference presentations; communicate uncertainty, methodology, and tradeoffs clearly to guide decision-making
What You’ll Need to Succeed
- 2+ years of data engineering experience with health/physiology data in a research context — you’ve built ETL pipelines around messy, real-world biometric or sensor datasets, not just clean CSVs
- Advanced Python and SQL proficiency — Pandas, NumPy, time-series analysis, and production-quality scripting are daily tools, not occasional ones
- Intermediate-to-advanced signal processing and biometric data experience — you’ve worked directly with heart rate, HRV, sleep staging, or similar physiological signals from wearable or embedded sensors
- Intermediate-to-advanced statistical modeling and validation skills — you can design and execute correlation analyses, error metrics, bootstrapping, and validation frameworks independently
- Working proficiency with AWS and Snowflake — you’ve built or maintained cloud-based data storage, retrieval, and archival systems, not just queried them
Bonus Points
- Experience with clinical or regulatory trial data, familiarity with ICH Good Clinical Practice (GCP) guidelines, or prior work supporting FDA submissions
- Background in ML model validation or building structured training datasets for supervised learning
- Fluency with AI-assisted development tools (Claude, Cursor, ChatGPT, Copilot) as part of your daily workflow
- Domain knowledge in sleep science, biometrics, or wearable/embedded sensor data
- Experience integrating internal and third-party APIs into unified data pipelines
- Strong cross-functional communication skills — ability to translate complex analyses into clear insights for non-technical stakeholders
Compensation
Target Base Salary: $110,000 – $130,000. Compensation is based on experience, qualifications, and market benchmarks for the Boston metro area. Equity and performance-based incentives are significant components of total compensation.