Data Science Fellow - AI/NLP
Postdoctoral researcher developing AI/NLP and knowledge engineering methods to transform biomedical literature into structured, evidence-grounded knowledge for organoid protocol standardization. Requires a PhD and strong Python/NLP research experience.
Responsibilities
- Design and implement AI/NLP methods for biomedical literature mining and structured protocol knowledge extraction.
- Develop benchmark datasets, annotation guidelines, and evaluation pipelines for scientific information extraction.
- Build and evaluate RAG, in-context learning, fine-tuning, graph matching, entity normalization, and knowledge graph (KG) query workflows.
- Analyze extraction errors, model behavior, retrieval failures, grounding quality, and biological ambiguity.
- Collaborate with software engineers to integrate research methods into usable tools and reproducible pipelines.
- Collaborate with organoid biologists and domain experts to translate biological protocol knowledge into computable representations.
- Prepare manuscripts, conference abstracts, technical reports, design documents, and open-source research artifacts.
- Help define research milestones, evaluation criteria, and publication strategy for protocol intelligence work.
Requirements
- PhD in computer science, computational biology, bioinformatics, biomedical informatics, NLP, machine learning, data science, or a related field.
- Strong Python programming skills.
- Demonstrated research experience with NLP, information extraction, LLMs, RAG, transformers, structured prediction, or scientific text mining.
- Ability to design controlled computational experiments, create benchmark datasets, and analyze results rigorously.
- Familiarity with biological, biomedical, or scientific data.
- Strong written communication skills and interest in publishing methods-oriented research.
- Comfort working with complex, evolving research codebases and interdisciplinary teams.
Preferred Qualifications
- Experience with scientific document processing, PDF parsing, biomedical literature mining, or methods-section extraction.
- Experience with knowledge graphs, ontologies, graph databases, graph algorithms, or semantic data modeling.
- Hands-on experience with fine-tuning LLMs, LoRA/QLoRA, Hugging Face, PyTorch, or API-based model evaluation.
- Hands-on experience with prompt engineering, structured JSON extraction, schema validation, tool use, or agentic LLM workflows.
- Hands-on experience with RAG systems, vector search, graph-augmented retrieval, or natural-language query over structured data.
- Exposure to bioinformatics concepts (e.g., sequence alignment, clustering, or phylogenetic analysis).
- Background in stem cell biology, organoids, developmental biology, wet-lab protocols, or biological assays.
Data Scientist
As a Data Scientist, you will shape the future of Pinterest's products by applying quantitative modeling, experimentation, and algorithms to complex engineering challenges. You will collaborate with cross-functional partners to bring scientific rigor to product development and deliver insights that influence product teams.
Bioinformatics/Data Scientist
Conduct multi-omics analyses (scRNA-seq, bulk RNA-seq, proteomics, metabolomics) on organoid systems to characterize fidelity and identify biomarkers. Develop computational pipelines, integrate public datasets, and collaborate with experimental teams. Requires a PhD in bioinformatics or a related field with strong R/Python skills.
Contract Data Scientist
Develops and iterates on ML/LLM models for clinical use cases in healthcare, conducts error analysis, and ships production-ready solutions. Requires strong Python skills, data science fundamentals, and familiarity with modern LLM techniques.
Scientist - Ensemble Structural Informatics
Leads development of standards and validation frameworks for dynamic structural biology data, focusing on ensemble models from X-ray crystallography and cryo-EM. Collaborates with engineers to build deposition, search, and retrieval infrastructure for the diffUSE Project. Requires a PhD in structural biology or a related field.