Data Engineer

Builds and scales the internal data platform by designing data models, pipelines, and analytics infrastructure that transform raw product and business data into reliable datasets for company-wide decision-making. Partners with stakeholders across Product, Engineering, Finance, Marketing, and Sales.

180k – 250k | San Francisco, CA | New York, NY | Data Engineering | Hybrid

About the role

Responsibilities

Design and maintain core data models and semantic layers
Develop and orchestrate batch and streaming data pipelines using technologies such as Apache Beam, Kafka, Airflow, or similar frameworks
Analyze inference and infrastructure telemetry, including data from OpenTelemetry, Grafana, and other observability tools
Define and maintain company-wide metrics across product usage, performance, and customer lifecycle
Enable self-service analytics through agents and tools, with well-structured semantic layers and context
Ensure data reliability and quality through testing, documentation, and governance

Preferred Qualifications

Understanding of inference metrics such as latency, throughput, token usage, and model performance
Experience supporting B2B SaaS and/or consumption-based platforms
Application of forecasting and predictive modeling (e.g., ARIMA, Prophet) to business processes

Benefits

Competitive compensation, including meaningful equity
100% coverage of medical, dental, and vision insurance for employees and dependents
Generous PTO policy, including a company-wide Winter Break
Paid parental leave
Company-facilitated 401(k)

Skills

Apache Beam, Kafka, Airflow, OpenTelemetry, Grafana, Data Pipelines, Semantic Layers, Batch Processing, Streaming Data, Data Modeling