Senior Staff Data Engineer
United States | Data Engineering | Remote | 8+ YOE
Summary
Senior data engineer defining long-term data strategy, designing scalable pipelines, and leading cross-functional initiatives. Requires 8+ years of experience, strong PySpark/SQL/Python skills, and expertise in Snowflake, Spark, Airflow, and dbt.
About the role
Architecture & Technical Strategy
- Define the long-term data engineering strategy guided by company-wide priorities and engineering best practices.
- Create coherent designs across multiple pipelines and API boundaries. Reduce complex concepts to foundational components and simplify infrastructure to lower maintenance costs.
- Make high-impact technical choices, including "build vs. buy" and framework selections, based on sound reasoning. Review designs to preemptively identify and resolve technical risks.
- Implement solutions that measurably improve developer efficiency and establish engineering-wide quality and best practices.
Execution & Business Impact
- Roll out major features and systems reliably, including appropriate monitoring, failure domain characterization, and success metric definitions.
- Leverage a deep understanding of SmithRx’s business strategy to identify group-wide opportunities. Proactively refocus team efforts when projects are off-course or not moving the needle for the business.
- Enforce data governance policies (PII/PHI protection, security, compliance) and implement data quality principles to raise the bar for the reliability of data shared internally and externally.
Leadership & Collaboration
- Influence the roadmaps of other SmithRx teams. Act thoughtfully and decisively in critical situations, seeking diverse perspectives but ultimately leading decision-making to move priorities forward.
- Serve as a role model and coach for other engineers, taking into account their unique skills and providing constructive feedback to maximize their impact.
- Develop focused messaging and effectively present technical strategies and business cases at the executive level.
- Break down silos, build deep cross-functional relationships, and create excitement to drive the adoption of new technologies or processes across the organization.
Requirements
- 8+ years of industry experience in data engineering with an advanced degree, or 12+ years with an undergraduate degree, in Computer Science, Information Technology, or a related field (start-up and healthcare experience highly desirable).
- Demonstrated mastery of data modeling concepts, database design principles, and data warehouse technologies (e.g., Snowflake) through production-grade implementations.
- Strong skills in PySpark, SQL, and Python required. Experience in modern object-oriented or compiled languages such as C#/C++, Go, Java, or Scala is a plus.
- Hands-on experience with leading data processing, orchestration, transformation, and BI tools and frameworks (e.g., Apache Spark, Apache Airflow, dbt, Looker, Superset).
- In-depth experience managing the entire data lifecycle, with direct responsibility for the development, implementation, and production release of complex data processing solutions utilizing distributed systems.
- Proven track record of making decisions optimized for the wider engineering organization rather than locally optimal outcomes, especially in environments with significant ambiguity.
Benefits
- Highly competitive wellness benefits, including Medical, Pharmacy, Dental, Vision, Life, and AD&D Insurance
- Flexible Spending Benefits
- 401(k) Retirement Savings Program
- Short-term and long-term disability
- Discretionary Paid Time Off
- 12 Paid Company Holidays
- Wellness Benefits
- Commuter Benefits
- Paid Parental Leave benefits
- Employee Assistance Program (EAP)
- Professional development and training opportunities
Skills
PySpark, SQL, Python, Snowflake, Apache Spark, Apache Airflow, dbt, Looker, Superset, C#, C++, Go, Java, Scala