Sr. Data Engineer
Alabama, Arizona, California, Colorado, Data Engineering, Remote, 5+ YOE
Summary
Builds and maintains scalable data pipelines, ETL/ELT processes, and workflows using Airflow, Spark, Trino, and AWS services to support analytics and product innovation. Requires 5+ years of experience in data engineering with Python proficiency.
About the role
Key Responsibilities
- Build & Maintain Data Pipelines: Design, implement, and maintain scalable data pipelines using modern data tools to process and manage large datasets efficiently.
- ETL Development: Develop and optimize ETL/ELT pipelines to ingest, transform, and deliver data from multiple internal and external sources.
- Workflow Orchestration: Build and manage workflows using Apache Airflow to ensure reliable scheduling and monitoring of data processes.
- Query Engines & Processing Frameworks: Leverage tools such as Trino (Presto), Apache Spark, and related distributed processing technologies to support analytics and data applications.
- Data Modeling & Warehousing: Contribute to schema design and data modeling efforts to ensure clean, well-structured, and scalable data architecture.
- Data Quality & Governance Support: Implement monitoring, validation checks, and best practices to ensure data accuracy, consistency, and reliability.
- Optimize Data Infrastructure: Utilize AWS services (S3, Redshift, Glue, Athena, Lambda) and modern data technologies (e.g., Apache Iceberg) to support a scalable and efficient data platform.
- Cross-Functional Collaboration: Partner with engineering, product, analytics, and business teams to understand requirements and deliver high-quality data solutions.
- Monitor & Improve Performance: Proactively monitor pipelines and workflows, troubleshoot issues, and continuously improve performance and reliability.
Qualifications
- 5+ years of experience building and maintaining data pipelines in a data engineering or related role.
- Hands-on experience with data tools such as Apache Airflow, Apache Spark, Apache Iceberg, Trino/Presto, and AWS services (S3, Redshift, Glue, Athena, Lambda).
- Proficiency in Python (or similar language) for data processing and pipeline development.
- Solid understanding of data warehousing concepts, schema design, and data modeling best practices.
- Experience deploying and supporting data pipelines in production environments.
- Strong analytical skills and ability to diagnose and resolve data-related issues.
- Ability to communicate effectively with both technical and non-technical stakeholders and work in cross-functional teams.
Bonus Points
- Experience with visualization tools such as Looker or Tableau.
- Exposure to data governance, privacy, and quality frameworks.
- Familiarity with CI/CD practices and version control for data workflows.
Skills
Apache Airflow, Apache Spark, Apache Iceberg, Trino, Presto, AWS S3, AWS Redshift, AWS Glue, AWS Athena, AWS Lambda, Python, ETL, ELT, Data Warehousing, Data Modeling