Data Engineer II
California, Arizona, Colorado, Florida | Data Engineering | Remote | 5+ YOE
Summary
Builds and maintains scalable data pipelines, develops ETL/ELT workflows, and optimizes data warehousing using tools such as Airflow, Spark, and AWS. Requires 5+ years of data engineering experience, including Python, SQL, and distributed query engines, along with the ability to collaborate across engineering, product, and business teams.
About the role
Key Responsibilities
- Build & Maintain Data Pipelines: Develop and maintain scalable data pipelines to ingest, process, and transform data from multiple sources.
- ETL Development: Support the design and optimization of ETL/ELT workflows to ensure efficient and reliable data delivery.
- Workflow Orchestration: Work with tools like Apache Airflow to schedule and manage data workflows.
- Data Quality & Reliability: Help implement data quality checks, validation processes, and monitoring to ensure accuracy and consistency.
- Data Modeling & Warehousing: Contribute to data modeling efforts, schema design, and data warehouse optimization.
- Query & Processing Frameworks: Utilize tools such as Trino (Presto), Apache Spark, or similar technologies to support distributed data processing.
- Infrastructure & Performance Optimization: Assist in improving performance, scalability, and cost-efficiency of data systems using modern cloud platforms (AWS).
- Cross-Functional Collaboration: Partner with stakeholders across engineering, product, and business teams to understand data needs and deliver solutions.
- Monitoring & Troubleshooting: Identify issues in pipelines and workflows, troubleshoot effectively, and implement long-term fixes.
Qualifications
Experience
- 5+ years of experience in data engineering, data infrastructure, or related roles
Technical Skills
- Experience with Python or similar languages for data processing
- Familiarity with SQL and distributed query engines (e.g., Trino/Presto)
- Exposure to Apache Spark or similar processing frameworks
- Experience with workflow orchestration tools (e.g., Apache Airflow)
Cloud & Data Platforms
- Experience with AWS services such as S3, Redshift, Glue, or Athena
Data Fundamentals
- Understanding of data modeling, warehousing concepts, and schema design
Data Quality & Governance
- Familiarity with data validation, quality checks, and governance best practices
Problem Solving
- Strong analytical mindset with the ability to troubleshoot and optimize data systems
Communication
- Ability to communicate clearly with both technical and non-technical stakeholders
Nice to Have
- Experience with modern open table formats (e.g., Apache Iceberg)
- Exposure to BI/visualization tools (e.g., Looker, Tableau)
- Experience working in a SaaS or product-driven environment
Skills
Python, SQL, Apache Spark, Apache Airflow, Trino, Presto, AWS, S3, Redshift, Glue, Athena, ETL, ELT, Data Modeling, Apache Iceberg