Staff Data Engineer
Designs and evolves scalable Iceberg-based lakehouse architecture, metadata governance, and security controls for analytics, product, and AI systems. Requires 12+ years of experience with Python, SQL, Airflow, and major cloud platforms.
Responsibilities
Technical Leadership & Architecture
- Design and evolve the Iceberg-based lakehouse architecture to balance scalability, cost, performance, and maintainability.
- Define and promote standards for table design, partitioning, schema evolution, optimization, and data layout.
- Lead architectural efforts spanning batch, streaming, and event-driven data processing where they deliver business value.
- Drive the design and delivery of complex, cross-team initiatives, enabling teams to move independently within established architectural guidance.
- Evaluate and integrate new technologies, weighing build-versus-buy trade-offs.
Metadata, Governance & Open Standards
- Define how datasets, pipelines, features, and models are described, related, and governed using shared metadata.
- Lead the adoption and integration of open-source metadata and catalog tools (e.g., OpenMetadata).
- Establish metadata standards that enable self-service analytics, governance, and AI readiness.
- Partner with BI and Analytics to ensure domain models are clearly documented and aligned to business language.
- Collaborate with Data Science to ensure model inputs, features, and outputs are traceable, explainable, and reusable.
Security, Access Control & Compliance
- Design and evolve security and access-control models for Apache Iceberg, including table-, column-, and row-level controls.
- Partner with Security and Platform teams to embed policy enforcement directly into data access paths.
- Drive metadata-driven authorization patterns that scale across tools and user groups.
- Ensure privacy, compliance, and regulatory requirements are incorporated into platform design.
- Balance strong security guarantees with usability to support safe self-service.
Platform Reliability & Operations
- Build and maintain automation for compaction, retention, lifecycle management, and cost controls.
- Establish observability standards that connect pipeline health, data quality, and reliability metrics.
- Provide architectural oversight during critical incidents and drive long-term reduction of "Keep the Lights On" (KTLO) work.
- Recommend tooling and process improvements based on industry standards and operational experience.
Organizational Impact & Collaboration
- Align technical work with business priorities by understanding how data supports onX products and customer outcomes.
- Communicate complex technical concepts clearly to engineers, product partners, and leadership.
- Lead and participate in architecture and design reviews, setting a high bar for technical rigor.
- Foster strong cross-team collaboration across Data Engineering, Platform, Security, Analytics, and Data Science.
- Mentor senior and mid-level engineers, raising the technical bar across the team.
Requirements
Required
- Bachelor’s degree in Computer Science or equivalent experience.
- Deep industry experience (typically 12+ years) building and operating large-scale data systems.
- Deep expertise in distributed data systems and data architecture.
- Strong experience with Apache Iceberg and similar table formats (Delta Lake, Hudi).
- Proven experience designing secure and governed data platforms.
- Expertise in Python, SQL, and orchestration patterns (e.g., Airflow).
- Experience working with data ecosystems, including metadata, catalog, or governance tooling.
- Strong written and verbal communication skills.
- Permanent U.S. work authorization.
Cloud & Platform Experience
- Deep experience in at least one major cloud environment (GCP, AWS, or Azure).
- Familiarity with cloud-native data services such as query engines, stream/batch processing systems, and object storage–based lakehouses.
- Comfort with infrastructure-as-code and automated platform management.
Compensation
- Base salary: $175,000 - $218,000 upon hire (varies based on experience, skills, certifications, and education).
- Full-time employees are eligible for common share options (subject to a vesting schedule) and a potential annual bonus of 10%, based on company performance.
Software Engineer, Data Platform
Build and maintain data infrastructure processing petabytes of data. Own end-to-end projects for data ingestion, transformation, and serving systems. Requires 3+ years of software engineering experience.
Staff Analytics Engineer
Design and maintain a robust business data layer in dbt to enable trusted GTM sales analytics, reporting, data science, and AI capabilities. Requires 8+ years in analytics engineering with advanced SQL and dbt expertise.
Data Engineer
Own and extend the customer data ingestion platform and the large-scale pipelines powering AI workers. Build the data lake, retrieval layer, and infrastructure for syncing, enriching, and querying customer data across CRMs and third-party systems.
Staff Software Engineer, Data Platform
Build and scale high-volume, low-latency distributed data platform services and analytics infrastructure using Java, Kinesis, Flink, Snowflake, and Kubernetes. Requires 8+ years of experience and U.S. Person status for FedRAMP access.