Data Engineer, Scaling Analytics
Build and scale data pipelines, models, and reporting systems that power OpenAI's infrastructure operations, capacity planning, and supply chain decisions.
Key Responsibilities
- Design, build, and maintain scalable data pipelines supporting infrastructure deployment, operations, capacity planning, and supply chain functions.
- Develop trusted datasets and reporting systems that provide visibility into hardware inventory, deployment status, site readiness, capacity utilization, and operational performance.
- Partner with cross-functional stakeholders to define metrics, establish data standards, and improve decision-making across infrastructure organizations.
- Create scalable data models that enable consistent reporting and analytics across multiple data sources and operational systems.
- Improve data quality, lineage, observability, and governance practices across critical infrastructure datasets.
- Support executive reporting, operational reviews, forecasting exercises, and strategic planning initiatives through reliable analytical foundations.
- Collaborate with engineering teams to integrate new data sources and operational telemetry into existing analytics ecosystems.
- Build solutions that reduce manual reporting efforts and improve the speed and accuracy of infrastructure decision-making.
- Document systems, processes, and analytical frameworks to improve long-term maintainability and organizational resilience.
Qualifications
- 5+ years of experience building and maintaining production data pipelines and analytical systems.
- Strong proficiency in SQL and experience designing scalable data models.
- Proficiency in Python or another programming language commonly used for data engineering.
- Experience working with modern data warehouses (e.g., Snowflake, BigQuery, Redshift) and orchestration frameworks (e.g., Airflow, Dagster).
- Experience designing reliable ETL/ELT workflows with a focus on maintainability, performance, and operational excellence.
- Experience partnering with cross-functional stakeholders to translate business requirements into technical solutions.
- Experience implementing data quality checks, monitoring, and observability practices in production environments.
Preferred Skills
- Experience supporting infrastructure, hardware operations, supply chain, manufacturing, logistics, or capacity planning organizations.
- Familiarity with large-scale operational telemetry and business-critical reporting environments.
- Experience with distributed processing frameworks such as Spark.
- Experience with transformation frameworks such as dbt.
- Experience developing executive reporting and operational review metrics.
- Experience operating in fast-paced, ambiguous environments with evolving priorities.
Enterprise Application Data Architect, GTM Systems
Define and improve data architecture for GTM systems and enterprise CRM. Lead Salesforce data modeling, integrations, governance, and quality initiatives across the customer lifecycle.
Staff Engineer - Data Platform
Staff-level technical lead and architect for Haus's data ingestion and normalization platform. Owns schema evolution, data contracts, data quality, lineage, and observability in a GCP/BigQuery/dbt stack. Partners with Data Science and Product; mentors senior engineers.
Senior Staff Data Engineer
Lead the multi-year vision and architecture for data governance and quality at scale. Define best practices, tooling, and culture while coaching senior engineers and influencing executive strategy on compliance and data integrity.
Senior Manager, Data Engineering
Lead the technical direction and team for Justworks' core data platform. Own architecture, infrastructure, and standards for pipelines, orchestration, and data governance while managing managers and ICs.
Staff Analytics Engineer
CodeRabbit is seeking a Staff Analytics Engineer to build and own its BigQuery and dbt data foundation. The role involves architecting the data warehouse, defining key metrics, building revenue models, and developing GTM intelligence layers.