Bioinformatics Engineer
Develop and optimize Nextflow-based bioinformatics pipelines for high-throughput sequencing analysis on Google Cloud Platform. Requires 3+ years of production pipeline experience, Nextflow proficiency, and strong genomics analysis skills.
Key Responsibilities
- Evaluate existing bioinformatics pipelines for performance, accuracy, and maintainability, identifying opportunities for optimization and enhancement
- Develop, extend, and maintain Nextflow workflows for bulk RNA-seq, single-cell RNA-seq, DNA variant calling, methylation analysis, and emerging assay types
- Architect scalable solutions capable of processing thousands of samples efficiently on Google Cloud Platform
- Optimize data input/output operations across pipeline modules to minimize bottlenecks and reduce computational costs
- Prepare and structure analysis outputs for visualization platforms and downstream statistical or machine learning applications
- Implement best practices for workflow versioning, containerization, testing, and documentation
- Collaborate with computational biologists, data scientists, and software engineers to integrate pipelines into broader analytical ecosystems
- Stay current with advances in sequencing technologies and analytical methods, evaluating and incorporating new tools as appropriate
Required Qualifications
- Master's degree in bioinformatics, computational biology, computer science, or a related field (or equivalent experience)
- 3+ years of hands-on experience developing and maintaining bioinformatics pipelines in a production environment
- Proficiency with Nextflow (DSL2) and familiarity with workflow management concepts
- Strong experience with RNA-seq analysis (both bulk and single-cell) and DNA sequencing workflows (variant calling, methylation)
- Working knowledge of Google Cloud Platform services (Compute Engine, Cloud Storage, Batch, Life Sciences API, or similar)
- Proficiency in Python and/or R for scripting, data manipulation, and tool development
- Experience with containerization technologies (Docker, Singularity)
- Familiarity with version control systems (Git) and CI/CD practices
- Strong understanding of genomic file formats (FASTQ, BAM, VCF, BED) and common bioinformatics tools (STAR, Salmon, BWA, GATK, Bismark, Cell Ranger, Seurat, Scanpy)
Preferred Qualifications
- PhD in a relevant field
- Experience with nf-core pipelines and community standards
- Familiarity with workflow orchestration at scale (Cromwell, AWS Batch, or similar platforms)
- Experience optimizing cloud costs and resource utilization for large-scale genomics workloads
- Knowledge of data visualization tools and frameworks (e.g., R Shiny, Plotly, custom dashboards)
- Experience in pharmaceutical, biotech, or regulated research environments
- Familiarity with FAIR data principles and metadata standards
Benefits
- 100% Medical, Dental & Vision Coverage for Employees
- Paid Time Off and Paid Holidays
- 401(k) match up to 5%
- Educational Benefits for Career Growth
- Employee Referral Bonus
- Flexible Spending Accounts: Healthcare (FSA), Parking Reimbursement Account (PRK), Dependent Care Assistance Program (DCAP), Transportation Reimbursement Account (TRN)