Senior Software Engineer - Data Infrastructure

Builds and scales data infrastructure including warehouses, lakehouses, Spark pipelines, streaming, and orchestration to enable data-driven decisions and ML at Plaid. Requires 5+ years experience in data platforms, strong system design, and leadership.

191k – 287k | San Francisco, CA | Data Engineering | Hybrid | 5+ YOE

About the role

Responsibilities

Contribute towards the long-term technical roadmap for data-driven and machine learning iteration at Plaid.
Lead key data infrastructure projects such as improving ML development golden paths, implementing offline streaming solutions for data freshness, building net new ETL pipeline infrastructure, and evolving data warehouse or data lakehouse capabilities.
Work with stakeholders in other teams and functions to define technical roadmaps for key backend systems and abstractions across Plaid.
Debug, troubleshoot, and reduce operational burden for our Data Platform.
Grow the team via mentorship and leadership, reviewing technical documents and code changes.

Qualifications

5+ years of software engineering experience.
Extensive hands-on software engineering experience, with a strong track record of delivering successful projects within the Data Infrastructure or Platform domain at similar or larger companies.
Deep understanding of at least one of the following: ML Infrastructure systems (Feature Stores, Training Infrastructure, Serving Infrastructure, and Model Monitoring) or Data Infrastructure systems (Data Warehouses, Data Lakehouses, Apache Spark, Streaming Infrastructure, and Workflow Orchestration).
Strong cross-functional collaboration, communication, and project management skills, with proven ability to coordinate effectively.
Proficiency in coding, testing, and system design, ensuring reliable and scalable solutions.
Demonstrated leadership abilities, including experience mentoring and guiding junior engineers.

Nice to Have

Experience with Databricks, Airflow, or AWS EMR.

Skills

Spark, Data Warehouse, Data Lakehouse, Streaming Infrastructure, Workflow Orchestration, Databricks, Airflow, AWS EMR, ML Infrastructure, ETL, Feature Stores, Training Infrastructure

Similar roles

Data Engineering jobs

Senior Analytics Engineer

Leads analytics engineering for Reddit's Sales and Marketing teams, building scalable data pipelines, ETLs, dashboards, and self-service tools to empower data-driven decision making. Requires 4-5+ years of experience with large-scale ETL systems, Python/SQL, and data modeling; an advanced quantitative degree is required.

191k – 267k | United States | Data Engineering | Remote | 5+ YOE | D3, SQL

Plaid

Senior Data Engineer - Data Engineering

Builds and owns scalable SQL/Python data pipelines, golden datasets, and workflows using dbt, Airflow, and Redshift for large-scale data (500TB+). Collaborates cross-functionally to enable data-driven decisions at Plaid. Requires 4+ years of data engineering experience.

191k – 287k | San Francisco, CA | Data Engineering | Hybrid | 4+ YOE | SQL, dbt

Sentry

Senior Software Engineer, Events Analytics Platform

Senior backend/infrastructure engineer expanding Sentry's time-series data platform (Snuba/ClickHouse) to handle petabyte-scale events with sub-second latency. Requires 4+ years of experience and distributed storage expertise.

190k – 280k | San Francisco, CA | Data Engineering | Hybrid | 4+ YOE | Redis, Kafka

Jellyfish

Senior Data Engineer

Jellyfish is seeking a Senior Data Engineer to build, automate, and execute the next generation of their data platform. The role involves maintaining end-to-end data pipelines, modernizing orchestration, and automating data infrastructure.

190k – 240k | United States | Data Engineering | Remote | SQL, dbt

Sage

Lead Data Product Engineer

Leads development and architecture of a client-facing data platform using Palantir Foundry in a low/no-code environment. Collaborates with Product and Design teams and applies software engineering best practices. Requires 7+ years of experience and a bachelor's degree in a quantitative field.

190k – 225k | New York, NY | Data Engineering | Hybrid | 7+ YOE | HIPAA, Python