Staff Data Engineer - TS/SCI Cleared

Leads development of high-performance data lakes and ETL pipelines for petabyte-scale cyber operations data. Partners with engineers and analysts to build reliable data systems, drives technical initiatives, and mentors team members. Requires 8+ years of experience and an active TS/SCI clearance.

Arlington, VA | Data Engineering | Onsite | 8+ YOE

About the role

Responsibilities

Lead the development and operation of a data lake for cyber operations and intelligence data.
Design schemas, partitions, and indexes that make complex datasets performant and cost-effective to query.
Partner with engineers and intelligence analysts to define query patterns and data products for mission use cases.
Build and evolve ETL pipelines that are observable, recoverable, and resilient to upstream change.
Drive technical initiatives end-to-end, from architecture decisions through production rollout and iteration.
Establish best practices for data quality, documentation, and operational ownership across the platform.
Mentor engineers on data modeling, performance tuning, and production-grade pipeline design.
Identify bottlenecks in storage/compute/query layers and ship improvements with clear performance wins.

Requirements

8+ years of experience in data engineering and/or data architecture.
Mastery-level expertise building ETL pipelines and operating them in production.
Deep experience with data lake architecture and systems used to query data lakes.
Strong schema and index design skills, including partitioning, indexing, and clustering strategies.
Experience with column-oriented databases in production environments.
Experience building data systems from scratch, not only maintaining existing platforms.
Proven leadership experience mentoring engineers and driving technical initiatives.
U.S. citizen with active TS/SCI security clearance with appropriate polygraph.

Nice To Haves

Experience with key-value datastores.
Experience with streaming and message queue systems.
Experience with graph database technologies.
Experience with internet/networking datasets (e.g., scan data, DNS, netflow, certificates).
Experience supporting analysts or operational users with high-stakes data needs.

Tech Stack

Data lakes: Apache Iceberg, Delta Lake, Apache Hive
Query engines: Trino, Presto, AWS Athena, Apache Spark
Column stores: ClickHouse, Amazon Redshift, Google BigQuery
ETL / orchestration: Airflow, AWS Glue, NiFi, ClickPipes
Streaming / queues: Kafka, RabbitMQ, NATS, AWS Kinesis
Graph: Neo4j, AWS Neptune, Memgraph, Apache AGE

Skills

Apache Iceberg, Delta Lake, Apache Hive, Trino, Presto, AWS Athena, Spark, ClickHouse, Amazon Redshift, Google BigQuery, Airflow, AWS Glue, Apache NiFi, Kafka, ETL Pipelines
