Senior Software Engineer, Data Platform

230k – 265k | San Francisco, CA | Data Engineering | Hybrid | 4+ YOE
Summary

Build and maintain scalable data pipelines and lakehouse infrastructure using PySpark, Databricks, and Airflow on AWS. Partner with Data Science and Engineering teams to improve data quality, observability, and ML platform support. Requires 4+ years of experience with Python, SQL, and cloud data stacks.

About the role

What You’ll Do

  • Design and build robust, highly scalable data pipelines and lakehouse infrastructure with PySpark, Databricks, and Airflow on AWS.
  • Improve the data platform development experience for Engineering, Data Science, and Product by creating intuitive abstractions, self‑service tooling, and clear documentation.
  • Own and maintain core data pipelines and models that power internal dashboards, ML models, and customer-facing products.
  • Own the Data & ML platform infrastructure using Terraform, including end‑to‑end administration of Databricks workspaces: manage user access, monitor performance, optimize configurations (e.g., clusters, lakehouse settings), and ensure high availability of data pipelines.
  • Lead projects to improve data quality, testing, observability, and cost efficiency across existing pipelines and backend systems (e.g., migrating Databricks SQL pipelines to dbt, scaling data ingestion, improving data-lineage tracking, and enhancing monitoring).
  • Act as the primary engineering partner for the Data Science team: work embedded with them to gather requirements, design scalable solutions, and provide end-to-end support on all engineering aspects of their work.
  • Work closely with backend engineers and data scientists to design performant data models and support new product development initiatives.
  • Share best practices and mentor other engineers working on data-centric systems.

What We’re Looking For

  • 4+ years of experience in software engineering with a strong background in data infrastructure, pipelines, and distributed systems.
  • Advanced proficiency in Python and SQL.
  • Hands-on Spark development experience.
  • Expertise with modern cloud data stacks—AWS (S3, RDS), Databricks, and Airflow—and lakehouse architectures.
  • Hands‑on experience with foundational data‑infrastructure technologies such as Hadoop, Hive, Kafka (or similar streaming platforms), Delta Lake/Iceberg, and distributed query engines like Trino/Presto.
  • Familiarity with ingestion frameworks, developer‑experience tooling, and best practices for data versioning, lineage, partitioning, and clustering.
  • Strong problem-solving skills and a proactive attitude toward ownership and platform health.
  • Excellent communication and collaboration skills, especially in cross-functional settings.

Bonus Points

  • Experience with AWS infrastructure using Terraform.
  • Familiarity with observability tools (e.g., Datadog) and cost tracking in cloud environments.
  • Experience with financial systems or building platforms in a fintech setting.
  • Prior work on ML infrastructure: feature stores (e.g., Tecton), the ML model lifecycle (training, deployment, monitoring, retraining), and real-time inference.
  • Contributions to internal tooling or open-source projects in the data ecosystem.

What We Offer

Salary Range: $230k-$265k

  • Equity grant
  • Medical, dental & vision insurance
  • Work-from-home flexibility
  • Unlimited PTO
  • Commuter benefits
  • Free lunches
  • Paid parental leave
  • 401(k)
  • Employee assistance program
Skills
Python, SQL, PySpark, Databricks, Airflow, AWS, Terraform, Spark, Kafka, Delta Lake, dbt, Trino, Hadoop, Hive, Datadog