
Staff Data Engineer

Lead data infrastructure for a national security tech company building a high-performance data lake and ETL pipelines for petabyte-scale cyber operations datasets. Requires 8+ years of experience, strong data lake expertise, and proven technical leadership.

New York, NY | Data Engineering | Onsite | 8+ YOE

About the role

Responsibilities

  • Lead the development and operation of a data lake for cyber operations and intelligence data
  • Design schemas, partitions, and indexes that make complex datasets performant and cost-effective to query
  • Partner with engineers and intelligence analysts to define query patterns and data products for mission use cases
  • Build and evolve ETL pipelines that are observable, recoverable, and resilient to upstream change
  • Drive technical initiatives end-to-end, from architecture decisions through production rollout and iteration
  • Establish best practices for data quality, documentation, and operational ownership across the platform
  • Mentor engineers on data modeling, performance tuning, and production-grade pipeline design
  • Identify bottlenecks in storage/compute/query layers and ship improvements with clear performance wins

Requirements

  • 8+ years of experience in data engineering and/or data architecture
  • Mastery-level expertise building ETL pipelines and operating them in production
  • Deep experience with data lake architecture and systems used to query data lakes
  • Strong schema and index design skills, including partitioning, indexing, and clustering strategies
  • Experience with column-oriented databases in production environments
  • Experience building data systems from scratch (not only maintaining existing platforms)
  • Proven leadership experience mentoring engineers and driving technical initiatives
  • U.S. citizen and able to meet the role’s security requirements

Nice to Have

  • Experience with key-value datastores
  • Experience with streaming and message queue systems
  • Experience with graph database technologies
  • Experience with internet/networking datasets (e.g., scan data, DNS, NetFlow, certificates)
  • Experience supporting analysts or operational users with high-stakes data needs

Tech Environment

  • Data lakes: Apache Iceberg, Delta Lake, Apache Hive
  • Query engines: Trino, Presto, AWS Athena, Apache Spark
  • Column stores: ClickHouse, Amazon Redshift, Google BigQuery
  • ETL / orchestration: Airflow, AWS Glue, NiFi, ClickPipes
  • Streaming / queues: Kafka, RabbitMQ, NATS, AWS Kinesis
  • Graph: Neo4j, AWS Neptune, Memgraph, Apache AGE

Skills

Apache Iceberg, Delta Lake, Apache Hive, Trino, Presto, AWS Athena, Spark, ClickHouse, Amazon Redshift, Google BigQuery, Airflow, AWS Glue, Kafka, RabbitMQ, Neo4j

Staff Data Engineer

Founding Data Engineer to architect Payabli's data platform from scratch: design lakehouse/warehouse, build pipelines, model financial data, and establish governance for a regulated fintech environment.

Florida | Data Engineering | Remote | 8+ YOE | SQL | dbt

Staff Engineer - Data Platform

Staff-level technical lead and architect for Haus's data ingestion and normalization platform. Owns schema evolution, data contracts, DQ frameworks, lineage, and pipeline observability in a GCP/BigQuery/dbt stack. Partners with DS and Product teams.

240k – 260k | Seattle, WA +1 | Data Engineering | Hybrid | 10+ YOE | SQL | dbt

Staff Data Engineer

Staff Data Engineer building and scaling data pipelines, integrations, and workflow orchestration systems. Owns architecture, IaC strategy, and technical leadership across large-scale data infrastructure.

200k – 260k | United States | Data Engineering | Remote | 7+ YOE | Python | Prefect

Data Engineer

Hands-on Data Engineer building the core data layer for a fast-growing AI observability startup. Own data models, pipelines, and trusted metrics across product usage, revenue, and GTM systems while partnering with Sales, RevOps, Marketing, and Finance.

San Francisco, CA +1 | Data Engineering | On-site | 10+ YOE | SQL | CRM

Senior Staff Data Engineer

Senior data engineer defining long-term data strategy, designing scalable pipelines, and leading cross-functional initiatives. Requires 8+ years of experience, strong PySpark/SQL/Python skills, and expertise in Snowflake, Spark, Airflow, and dbt.

United States | Data Engineering | Remote | 8+ YOE | C# | Go