
OpenAI Data Engineering Jobs

Open data engineering roles at OpenAI, pulled live from their hiring system.


12 open · OpenAI · Data Engineering

67% of open data engineering roles call out Python; Airflow and Kubernetes each appear in roughly a third. These roles are on-site or hybrid; none are fully remote.

Latest data engineering roles at OpenAI
OpenAI

Technical Lead Manager, Data Engineering, Trust & Safety

Lead and grow the Trust & Safety Data Engineering team, defining roadmap and technical strategy. Build privacy-safe datasets and pipelines for abuse detection, fraud detection, and safety monitoring. Partner with stakeholders to ensure launch readiness and operational rigor.

385k – 490k · San Francisco, CA · Data Engineering · On-site · SQL · Spark
OpenAI

IT Controls Data Engineer

As an IT Controls Data Engineer, you will build and maintain data infrastructure for audit readiness, IT controls, and continuous control monitoring. This role involves designing pipelines, datasets, and automated validation to ensure reliable control data.

293k – 385k · San Francisco, CA · Data Engineering · Hybrid · SQL · AWS
OpenAI

Senior Data Engineer, Core Experimentation

Build and manage data pipelines and canonical datasets for the experimentation platform, tracking product metrics like user growth and revenue. Collaborate with cross-functional teams at OpenAI; requires 3+ years of data engineering experience with Spark, ETL tools, and distributed systems.

293k – 325k · Bellevue, WA +1 · Data Engineering · Hybrid · S3 · Java
OpenAI

Data Engineer, People Innovation Labs

Build and manage data pipelines for people analytics and internal products like OpenHouse at OpenAI's People Innovation Labs. Collaborate with analytics and engineering teams using Databricks, Spark, and ETL tools; requires 3+ years of data engineering experience.

293k – 325k · San Francisco, CA · Data Engineering · Hybrid · S3 · Java
OpenAI

Lead to Opportunity, Data Systems Engineer

Build and integrate Salesforce lead-to-opportunity systems for GTM teams, focusing on data enrichment, workflow automation, and cross-system orchestration to drive pipeline creation at scale. Requires advanced Salesforce development expertise and cross-functional collaboration.

260k – 288k · San Francisco, CA · Data Engineering · On-site · Apex · SDLC
OpenAI

Software Engineer, Research - Human Data

Build full-stack systems, tools, and infrastructure for human feedback collection, AI model alignment, and evaluation. Collaborate with researchers to scale production systems and enhance model safety in a fast-paced environment.

131k – 385k · San Francisco, CA · Data Engineering · Hybrid · React · Python
OpenAI

Software Engineer, Distributed Data Systems (Sora)

Design and scale distributed data infrastructure for large-scale multimodal training and evaluation at OpenAI. Collaborate with researchers to build reliable, high-performance systems handling massive data volumes in a fast-paced environment.

230k – 385k · San Francisco, CA · Data Engineering · Hybrid · AWS · Kubernetes
OpenAI

Software Engineer, Data Infrastructure - Research

Design and implement dataset infrastructure for OpenAI's large-scale LLM training stack, including standardized APIs for multimodal data, scaling pipelines across GPU fleets, and performance debugging. Requires strong distributed systems experience and collaboration with researchers.

250k – 380k · San Francisco, CA · Data Engineering · On-site · GPU · APIs
OpenAI

Software Engineer, Habitat (Online Data)

Build and operate Habitat, OpenAI's core online database platform handling high-QPS, latency-sensitive workloads. Own end-to-end distributed systems for storage, caching, routing, CDC, and privacy; requires 8+ years of experience and Rust/Python expertise.

230k – 385k · Seattle, WA · Data Engineering · On-site · CDC · Rust
OpenAI

Software Engineer, Data Infrastructure

Build and operate scalable data infrastructure, including compute fleets, storage systems, and streaming platforms, to support OpenAI's AI products, research, and analytics. Requires 4+ years in data or infrastructure engineering with expertise in Spark, Kafka, and distributed systems.

185k – 385k · San Francisco, CA · Data Engineering · Hybrid · Spark · Kafka
OpenAI

Data Engineer, Analytics

Build and manage data pipelines and canonical datasets for product metrics, safety systems, and business decisions. Collaborate with cross-functional teams including Data Science and Research; requires 3+ years of data engineering experience with Spark, ETL tools, and distributed systems.

230k – 385k · San Francisco, CA · Data Engineering · On-site · S3 · Java
OpenAI

Software Engineer, Data Acquisition

Build and lead data acquisition systems, including web crawling, ingestion, and scalable distributed processing for model training. Requires 4+ years of experience, expertise in Kubernetes and large-scale data systems, and a BS/MS/PhD in Computer Science.

293k – 385k · San Francisco, CA · Data Engineering · On-site · Kubernetes · Web Crawling