Staff Data Engineer

Builds and maintains scalable data pipelines for processing massive lead datasets and real-time intent signals. Owns data ingestion, modeling, ML dataset preparation, quality monitoring, and optimization in a fast-paced AI startup.

200k – 300k · San Francisco, CA · New York, NY · Data Engineering · Onsite

About the role

Responsibilities

  • Design, build, and maintain scalable data pipelines that process and transform large volumes of structured and unstructured data
  • Manage ingestion from third-party APIs, internal systems, and customer datasets
  • Develop and maintain data models, data schemas, and storage systems optimized for ML and product performance
  • Collaborate with ML engineers to prepare model-ready datasets, embeddings, feature stores, and evaluation data
  • Implement data quality monitoring, validation, and observability
  • Work closely with product engineers to support new features that rely on complex data flows
  • Optimize systems for performance, cost, and reliability
  • Contribute to early architecture decisions, infrastructure design, and best practices for data governance
  • Build tooling that enables the entire team to access clean, well-structured data

Requirements

  • 3+ years of experience as a Data Engineer
  • Proficiency in Python, SQL, and modern data tooling (dbt, Airflow, Dagster, or similar)
  • Comfort working in fast, ambiguous environments
  • Experience designing and operating ETL/ELT pipelines in production
  • Experience with cloud platforms (AWS, GCP, or Azure)
  • Familiarity with data lakes, warehouses, and vector databases
  • Experience integrating APIs and working with semi-structured data (JSON, logs, event streams)
  • Strong understanding of data modeling and optimization

Nice-to-haves

  • Experience supporting LLMs, embeddings, or ML training pipelines
  • Startup experience

Skills

Python · SQL · dbt · Airflow · Dagster · ETL · ELT · AWS · GCP · Azure · Data Lakes · Data Warehouses · Vector Databases · APIs · JSON

Staff Data Engineer

Staff Data Engineer building and scaling data pipelines, integrations, and workflow orchestration systems. Owns architecture, IaC strategy, and technical leadership across large-scale data infrastructure.

200k – 260k · United States · Data Engineering · Remote · 7+ YOE · Python · Prefect

Member of Technical Staff — ML Data Infra

Build and operate large-scale multimodal data pipelines for AI avatar model training. Design production-grade systems for petabyte-scale video, audio, and text data.

200k – 300k · Seattle, WA · Data Engineering · On-site · 5+ YOE · Ray · DVC

Staff Data Engineer

As a Staff Data Engineer, you will architect and scale Imprint's data platform, optimizing infrastructure and driving technical excellence. You will build critical financial reporting pipelines, establish data standards, and mentor other engineers.

200k – 250k · San Francisco, CA +1 · Data Engineering · On-site · 10+ YOE · S3 · SQL

Senior Staff Data Infrastructure Engineer

Lead and contribute to architectural initiatives for data infrastructure in FedRAMP environments. This role focuses on scalability, cost-efficiency, operational excellence, and security compliance for data-intensive systems.

200k – 220k · United States · Data Engineering · Remote · 7+ YOE · EMR · AWS

Staff Data Architect

Jellyfish is seeking a Staff/Lead Data Architect to design, automate, and scale their next-generation data platform. This role involves maturing core data models, automating environment boundaries, and driving advanced observability and cost-attribution into the data pipeline architecture.

200k – 260k · United States · Data Engineering · Remote · SQL · Python