# Staff Data Engineer
**Company:** [Twenty](https://hotfix.jobs/companies/twenty)
**Location:** New York, NY
**Experience:** 8+ years
**Skills:** Apache Iceberg, Delta Lake, Apache Hive, Trino, Presto, AWS Athena, Spark, ClickHouse, Amazon Redshift, Google BigQuery, Airflow, AWS Glue, Kafka, RabbitMQ, Neo4j
**Posted:** 2026-05-19
> Lead data infrastructure for a national security tech company building a high-performance data lake and ETL pipelines for petabyte-scale cyber operations datasets. Requires 8+ years of experience, strong data lake expertise, and proven technical leadership.
## Job Description
## Responsibilities
- Lead the development and operation of a data lake for cyber operations and intelligence data
- Design schemas, partitions, and indexes that make complex datasets performant and cost-effective to query
- Partner with engineers and intelligence analysts to define query patterns and data products for mission use cases
- Build and evolve ETL pipelines that are observable, recoverable, and resilient to upstream change
- Drive technical initiatives end-to-end, from architecture decisions through production rollout and iteration
- Establish best practices for data quality, documentation, and operational ownership across the platform
- Mentor engineers on data modeling, performance tuning, and production-grade pipeline design
- Identify bottlenecks in storage/compute/query layers and ship improvements with clear performance wins

## Requirements
- 8+ years of experience in data engineering and/or data architecture
- Mastery-level expertise building ETL pipelines and operating them in production
- Deep experience with data lake architecture and systems used to query data lakes
- Strong schema and index design skills, including partitioning, indexing, and clustering strategies
- Experience with column-oriented databases in production environments
- Experience building data systems from scratch, not only maintaining existing platforms
- Proven leadership experience mentoring engineers and driving technical initiatives
- U.S. citizen and able to meet the role’s security requirements

## Nice to Have
- Experience with key-value datastores
- Experience with streaming and message queue systems
- Experience with graph database technologies
- Experience with internet/networking datasets (e.g., scan data, DNS, netflow, certificates)
- Experience supporting analysts or operational users with high-stakes data needs

## Tech Environment
- Data lakes: Apache Iceberg, Delta Lake, Apache Hive
- Query engines: Trino, Presto, AWS Athena, Apache Spark
- Column stores: ClickHouse, Amazon Redshift, Google BigQuery
- ETL / orchestration: Airflow, AWS Glue, NiFi, ClickPipes
- Streaming / queues: Kafka, RabbitMQ, NATS, AWS Kinesis
- Graph: Neo4j, AWS Neptune, Memgraph, Apache AGE

**Apply:** https://hotfix.jobs/jobs/staff-data-engineer-at-twenty-d65846cd-5670-4150-be9e-a0a45e7045bb
**Canonical:** https://hotfix.jobs/jobs/staff-data-engineer-at-twenty-d65846cd-5670-4150-be9e-a0a45e7045bb