NLM DIR Seminar Schedule
UPCOMING SEMINARS
- Feb. 18, 2025 Samuel Lee
Efficient predictions of alternative protein conformations by AlphaFold2-based sequence association
- Feb. 25, 2025 Zhizheng Wang
GeneAgent: Self-verification Language Agent for Gene Set Analysis using Domain Databases
- March 4, 2025 Sofya Garushyants
TBD
- March 11, 2025 Sanasar Babajanyan
TBD
- March 18, 2025 MG Hirsch
TBD
RECENT SEMINARS
- Feb. 11, 2025 Po-Ting Lai
Enhancing Biomedical Relation Extraction with Directionality
- Feb. 4, 2025 Victor Tobiasson
On the dominance of Asgard contributions to Eukaryogenesis
- Jan. 28, 2025 Kaleb Abram
Leveraging metagenomics to investigate the co-occurrence of virome and defensome elements at large scale
- Jan. 21, 2025 Qiao Jin
Artificial Intelligence for Evidence-based Medicine
- Jan. 17, 2025 Xuegong Zhang
Using Large Cellular Models to Understand Cell Transcriptomics Language
Scheduled Seminars on Jan. 17, 2025
In-person: Building 38A/B2N14 NCBI Library or Zoom
Contact NLMDIRSeminarScheduling@mail.nih.gov with questions about this seminar.
Abstract:
Large language models (LLMs) pretrained on massive data have shown their power as foundation models for a wide range of tasks in natural language understanding and beyond. This inspired us to develop large cellular models (LCMs) to decipher the transcriptomic language of cells. Toward this goal, we have developed LCMs for single-cell transcriptomics using two approaches, which produced two large models, scFoundation and scMulan. Pretrained on tens of millions of human scRNA-seq profiles covering almost all known cell types and states, the models have shown the ability to capture complex contextual relations among gene expression values and meta-attributes of cells. Experiments showed that the pretrained models can achieve state-of-the-art performance, either zero-shot or with light fine-tuning, on a diverse array of single-cell analysis tasks such as data enhancement, drug-response prediction at the tissue and single-cell levels, single-cell perturbation prediction, cell type annotation, gene module inference, and conditional cell generation.