# Staff Data Engineer
**Company:** [Artisan](https://hotfix.jobs/companies/artisan)
**Location:** San Francisco, CA; New York, NY
**Salary:** $200K-$300K
**Skills:** Python, SQL, dbt, Airflow, Dagster, ETL, ELT, AWS, GCP, Azure, Data Lakes, Data Warehouses, Vector Databases, APIs, JSON
**Posted:** 2025-12-03
> Builds and maintains scalable data pipelines for processing massive lead datasets and real-time intent signals. Owns data ingestion, modeling, ML dataset preparation, quality monitoring, and optimization in a fast-paced AI startup.
## Responsibilities
- Design, build, and maintain scalable data pipelines that process and transform large volumes of structured and unstructured data
- Manage ingestion from third-party APIs, internal systems, and customer datasets
- Develop and maintain data models, data schemas, and storage systems optimized for ML and product performance
- Collaborate with ML engineers to prepare model-ready datasets, embeddings, feature stores, and evaluation data
- Implement data quality monitoring, validation, and observability
- Work closely with product engineers to support new features that rely on complex data flows
- Optimize systems for performance, cost, and reliability
- Contribute to early architecture decisions, infrastructure design, and best practices for data governance
- Build tooling that enables the entire team to access clean, well-structured data

## Requirements
- **3+ years of experience** as a Data Engineer
- Proficiency in **Python**, **SQL**, and modern data tooling (**dbt**, **Airflow**, **Dagster**, or similar)
- Comfort working in fast-paced, ambiguous environments
- Experience designing and operating **ETL/ELT pipelines** in production
- Experience with cloud platforms (**AWS**, **GCP**, or **Azure**)
- Familiarity with **data lakes**, **data warehouses**, and **vector databases**
- Experience integrating **APIs** and working with semi-structured data (**JSON**, logs, event streams)
- Strong understanding of **data modeling** and optimization

## Nice-to-haves
- Experience supporting **LLMs**, **embeddings**, or **ML training pipelines**
- Startup experience

**Apply:** https://hotfix.jobs/jobs/staff-data-engineer-at-artisan-fe6e77f1-cc1d-4249-8e01-6177addf2e1c
**Canonical:** https://hotfix.jobs/jobs/staff-data-engineer-at-artisan-fe6e77f1-cc1d-4249-8e01-6177addf2e1c