# Data Engineer
**Company:** [Axle](https://hotfix.jobs/companies/axle)
**Location:** Rockville, MD
**Skills:** Python, SQL, ETL, ELT, Spark, PySpark, Git, Snowflake, Databricks, Data Modeling
**Posted:** 2026-06-24
> Design, build, and maintain data pipelines for biomedical and clinical research datasets. Work with scientists and researchers to deliver accessible, well-governed data products using Python, SQL, and ETL/ELT processes.
## Job Description
## Key Responsibilities

**Data Pipeline Development**
- Design, build, test, and maintain data pipelines to ingest, transform, harmonize, and integrate diverse biomedical and research data sources, including clinical, genomic, experimental, imaging, biospecimen, operational, and other scientific datasets
- Develop reusable transformation logic and curated datasets that support analytics, reporting, dashboards, applications, APIs, and downstream research workflows

**Data Integration and Lifecycle Support**
- Support the full research data lifecycle by enabling reliable data movement from source systems and storage environments into structured, analysis-ready formats
- Assist with data ingestion, curation, metadata capture, data refreshes, source-to-target mapping, schema management, and long-term maintainability of data products and workflows

**Collaboration**
- Work closely with data scientists, bioinformaticians, researchers, application developers, project managers, and government stakeholders to gather requirements and deliver practical data solutions
- Translate scientific and operational data needs into technical specifications, data models, transformation logic, and reusable datasets

**Quality & Governance**
- Implement data validation checks, reconciliation routines, testing practices, and monitoring processes to ensure data accuracy, completeness, consistency, and integrity
- Follow data governance and security best practices, including documentation of transformations, lineage, assumptions, access requirements, and compliance considerations

**Dashboarding & Integration**
- Create or support interactive dashboards, reporting layers, APIs, and application-ready datasets
- Support integration between data pipelines, databases, cloud platforms, analytics environments, and approved application platforms

**Operational Support and Modernization**
- Troubleshoot data pipeline failures, source system inconsistencies, data quality issues, schema changes, access issues, and performance bottlenecks
- Contribute to modernization efforts by improving automation, documentation, scalability, reproducibility, and platform readiness

## Required Qualifications

- Bachelor's degree in Computer Science, Data Science, Bioinformatics, Biomedical Informatics, Information Systems, Engineering, or a related field, or equivalent practical experience
- Proven experience as a Data Engineer, Analytics Engineer, Data Integration Developer, Bioinformatics Engineer, or similar data-intensive role
- Strong proficiency in Python and SQL for data manipulation, transformation, scripting, automation, and analysis
- Hands-on experience building ETL/ELT processes and data pipelines to support large, complex, multi-source datasets
- Familiarity with scalable data processing approaches, including Spark/PySpark or similar frameworks
- Solid understanding of data modeling, relational databases, data warehouses, data lakes, metadata, and database concepts
- Ability to work with complex, multi-modal datasets, including structured, semi-structured, and unstructured data
- Knowledge of software engineering and data engineering best practices, including version control using Git, code review, automated testing, documentation, and change management
- Experience ensuring data quality and using lineage, provenance tracking, audit trails, or documentation practices
- Excellent problem-solving skills and the ability to communicate effectively with both technical and non-technical stakeholders
- Strong interest in biomedical science, clinical research, healthcare data, and scientific discovery
- Demonstrated awareness of sensitive data handling, privacy, access control, data governance, and regulatory or compliance expectations

## Preferred Qualifications

- Hands-on experience building data solutions in modern data platforms or platform-as-a-service environments such as Snowflake, Databricks, Palantir, cloud data warehouses, data lakes, or similar platforms
- Experience supporting integrations across databases, cloud storage, APIs, analytics platforms, dashboards, and application environments
- Experience preparing curated datasets for dashboards, APIs, web applications, reporting tools, notebooks, or scientific computing environments
- Familiarity with research-facing tools and platforms such as Posit Connect, R/Shiny, Streamlit, Jupyter, Galaxy, Code

**Apply:** https://hotfix.jobs/jobs/data-engineer-at-axle-5799a723-8a21-40fe-9edd-a6009f523559
**Canonical:** https://hotfix.jobs/jobs/data-engineer-at-axle-5799a723-8a21-40fe-9edd-a6009f523559