NLM IRP Seminar Schedule



Scheduled Seminars on April 25, 2024

Ermin Hodzic
3 p.m.
Presentation Title
Condition-Aware Cell Type Deconvolution of Bulk Tissues

Contact with questions about this seminar.


Transcriptome analysis is a key tool for investigating healthy and diseased tissues at the molecular level. While single-cell RNA sequencing offers valuable insights into cell types and states, complex sample preparation procedures and higher cost restrict its widespread adoption compared to the older bulk RNA sequencing, which is already widely established, costs less, and has exhaustive population-level data collections available. In addition, newer technology in the form of spatial transcriptomics, which offers locality-based insights, essentially produces data from many thousands of localized bulk mixtures. However, bulk expression data comprise a mixture of heterogeneous cell types and capture only average expression. Thus, deconvolving bulk mixtures and inferring cell type populations from bulk expression remains indispensable.

Many computational methods have been developed to infer cell type proportions from bulk data, generally guided by reference data based on single-cell sequencing. However, technological inconsistencies between the bulk mixtures and the reference affect the accuracy of such approaches. Moreover, medical conditions are associated with tissue reprogramming, possibly resulting in changes in cell type composition.
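
For readers new to the topic, reference-based deconvolution in its most basic form can be sketched as a non-negative least squares problem: find non-negative cell type weights whose mixture of reference profiles best matches the bulk expression. The sketch below is only an illustration of that generic baseline, not the model presented in the talk. The function name `deconvolve`, the toy reference matrix, and the projected gradient solver are all assumptions chosen for brevity.

```python
def deconvolve(reference, bulk, steps=2000):
    """Estimate cell type proportions by non-negative least squares.

    reference: list of rows, one per gene; each row holds the mean
               expression of that gene in every cell type (hypothetical data).
    bulk:      list of bulk expression values, one per gene.
    """
    n_genes = len(reference)
    n_types = len(reference[0])
    # The squared Frobenius norm bounds the Lipschitz constant of the
    # gradient, so 1 / lip is a safe step size.
    lip = sum(v * v for row in reference for v in row)
    weights = [1.0 / n_types] * n_types
    for _ in range(steps):
        # Residual of the current mixture against the bulk profile.
        residual = [
            sum(r * w for r, w in zip(row, weights)) - b
            for row, b in zip(reference, bulk)
        ]
        # Gradient of 0.5 * ||reference @ weights - bulk||^2.
        grad = [
            sum(reference[g][k] * residual[g] for g in range(n_genes))
            for k in range(n_types)
        ]
        # Gradient step followed by projection onto non-negative weights.
        weights = [max(0.0, w - g / lip) for w, g in zip(weights, grad)]
    total = sum(weights)
    # Normalize so the weights can be read as proportions.
    return [w / total for w in weights] if total > 0 else weights


# Toy example: 4 genes, 3 cell types, noise-free bulk mixture.
toy_reference = [
    [10.0, 1.0, 0.0],
    [0.0, 8.0, 1.0],
    [1.0, 0.0, 9.0],
    [5.0, 5.0, 5.0],
]
true_proportions = [0.5, 0.3, 0.2]
toy_bulk = [
    sum(r * p for r, p in zip(row, true_proportions)) for row in toy_reference
]
estimated = deconvolve(toy_reference, toy_bulk)
```

On this noise-free toy input the estimate should recover the mixing proportions closely. With real data, the reference and the bulk samples rarely agree this well, which is the kind of inconsistency the abstract refers to.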

In this talk, a new model for cell type deconvolution is introduced, to our knowledge the first to offer condition-aware cell type deconvolution. It allows both the incorporation of a quantitative condition that may have a sample-specific effect on the expression of certain genes in certain cell types and an implicit mechanism of correction for inconsistencies between the reference and the bulk mixtures. We give an efficient method to solve the model, inferring both cell type proportions and the trend of the influence of the quantitative condition on gene expression in cell type populations. Our benchmarks demonstrate the increased accuracy of this model over more basic models, and increased resilience to inconsistencies between the reference and the bulk expression.