Data Engineer II

California, Arizona, Colorado, Florida, Data Engineering, Remote, 5+ YOE
Summary

Builds and maintains scalable data pipelines, develops ETL/ELT workflows, and optimizes data warehousing using tools like Airflow, Spark, and AWS. Requires 5+ years of data engineering experience, working knowledge of Python, SQL, and distributed query engines, and the ability to collaborate with cross-functional teams.

About the role

Key Responsibilities

  • Build & Maintain Data Pipelines: Develop and maintain scalable data pipelines to ingest, process, and transform data from multiple sources.
  • ETL Development: Support the design and optimization of ETL/ELT workflows to ensure efficient and reliable data delivery.
  • Workflow Orchestration: Work with tools like Apache Airflow to schedule and manage data workflows.
  • Data Quality & Reliability: Help implement data quality checks, validation processes, and monitoring to ensure accuracy and consistency.
  • Data Modeling & Warehousing: Contribute to data modeling efforts, schema design, and data warehouse optimization.
  • Query & Processing Frameworks: Utilize tools such as Trino (Presto), Apache Spark, or similar technologies to support distributed data processing.
  • Infrastructure & Performance Optimization: Assist in improving performance, scalability, and cost-efficiency of data systems using modern cloud platforms (AWS).
  • Cross-Functional Collaboration: Partner with stakeholders across engineering, product, and business teams to understand data needs and deliver solutions.
  • Monitoring & Troubleshooting: Identify issues in pipelines and workflows, troubleshoot effectively, and implement long-term fixes.

Qualifications

Experience

  • 5+ years of experience in data engineering, data infrastructure, or related roles

Technical Skills

  • Experience with Python or similar languages for data processing
  • Familiarity with SQL and distributed query engines (e.g., Trino/Presto)
  • Exposure to Apache Spark or similar processing frameworks
  • Experience with workflow orchestration tools (e.g., Apache Airflow)

Cloud & Data Platforms

  • Experience with AWS services such as S3, Redshift, Glue, or Athena

Data Fundamentals

  • Understanding of data modeling, warehousing concepts, and schema design

Data Quality & Governance

  • Familiarity with data validation, quality checks, and governance best practices

Problem Solving

  • Strong analytical mindset with the ability to troubleshoot and optimize data systems

Communication

  • Ability to communicate clearly with both technical and non-technical stakeholders

Nice to Have

  • Experience with modern table formats (e.g., Apache Iceberg)
  • Exposure to BI/visualization tools (e.g., Looker, Tableau)
  • Experience working in a SaaS or product-driven environment
Skills
Python, SQL, Apache Spark, Apache Airflow, Trino, Presto, AWS, S3, Redshift, Glue, Athena, ETL, ELT, Data Modeling, Apache Iceberg