Senior Data Engineer - AI Infrastructure
Own and evolve streaming data pipelines and feature stores powering real-time ML inference and model serving at Kraken. Requires 5+ years data engineering experience with 2+ years in production streaming systems.
Responsibilities
- Own and evolve streaming data pipelines that power live inference and real-time model serving across Kraken's AI infrastructure
- Design and build feature stores that serve low-latency, high-reliability features to production ML models
- Implement and maintain streaming systems using RisingWave, Apache Flink, or Kafka Streams, selecting the right tool for the workload
- Partner with ML engineers and AI infra teams to define data contracts, feature schemas, and pipeline SLAs
- Move pipelines that run as batch today toward real time, reducing latency from hours to seconds
- Ensure data quality, observability, and auditability across all streaming and feature engineering systems
- Contribute to inference pipeline tooling where data engineering and model serving intersect
- Evaluate emerging streaming and feature store technologies and shape the team's technical roadmap
Requirements
- 5+ years in data engineering with at least 2 years focused on streaming systems in production
- Hands-on experience with RisingWave, Apache Flink, Kafka Streams, or comparable stream processing frameworks
- Strong understanding of feature store design: online/offline consistency, point-in-time correctness, and low-latency serving
- Experience building data pipelines that feed production ML models or inference systems
- Proficiency in Python and/or Scala; SQL fluency required
- Familiarity with data quality frameworks, pipeline observability, and SLA ownership
- Comfortable operating in a fast-moving, ambiguous environment embedded within an AI-focused team
Nice to Have
- Direct experience with RisingWave in production
- Exposure to inference pipeline architecture or model serving infrastructure
- Experience with feature platforms
- Crypto or fintech domain experience