NLM IRP Seminar Schedule

Scheduled Seminars on May 14, 2024

Speaker: Stanley Liang
Time: 11 a.m.
Presentation Title: Knowledge-driven Latent Diffusion For COVID-19 Pneumonia Radiology Pattern Synthesis
Location:
Contact NLM_IRP_Seminar_Scheduling@mail.nih.gov with questions about this seminar.

Abstract:

Objective: The latent diffusion model (LDM) is a state-of-the-art method for synthesizing medical images with designated knowledge. We propose a novel knowledge-driven strategy that establishes a cross-modal binding between medical knowledge and the target visual patterns of COVID-19 pneumonia in an LDM using a class-specific prior preservation technique.
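As an illustration of the prompting scheme implied above, the following minimal Python sketch generates images from a fine-tuned Stable Diffusion checkpoint with a combined "pattern identifier, class identifier" prompt using the Hugging Face diffusers API. The local checkpoint path and sampling parameters are hypothetical, not taken from the work described here.

```python
import torch
from diffusers import StableDiffusionPipeline

# Assumed local path to an LDM fine-tuned as described in the abstract (hypothetical).
pipe = StableDiffusionPipeline.from_pretrained(
    "./ldm-covid-cxr-finetuned", torch_dtype=torch.float16
).to("cuda")

# Pattern identifier + class identifier, matching the training-time pairing.
prompt = "bilateral lung edema mRALE 24, chest x-ray"
images = pipe(prompt, num_inference_steps=50, guidance_scale=7.5).images
images[0].save("synthetic_cxr.png")
```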

Method: We used the Stable Diffusion 2-1-base LDM, pretrained on large image datasets, as the base model for optimization. The LDM was trained separately on a chest X-ray (CXR) dataset of 2,599 frontal CXR images and a chest computed tomography (CT) dataset of 104 CT scans from confirmed COVID-19 cases and 56 normal CT scans. When training on the CXR dataset, each image was paired with the pattern identifier “bilateral lung edema mRALE 24” and the class identifier “chest x-ray”; when training on the CT dataset, each image was paired with the pattern identifier “COVID-19 pneumonia” and the class identifier “chest CT”. The model was optimized with an objective that combines the class-specific prior preservation loss and the reconstruction loss, binding the medical concepts to the corresponding visual patterns via the CLIP text encoder and the VAE in the LDM architecture. For quality comparison, we also synthesized images with a Wasserstein GAN with gradient penalty (WGAN-GP) and a pure denoising diffusion implicit model (DDIM).
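A hedged sketch of the combined objective described above is given below: the standard latent-diffusion reconstruction (noise-prediction) loss on the medical images plus a class-specific prior preservation loss on images paired with the bare class identifier. Function names, the batch layout, and the prior_weight value are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn.functional as F

def diffusion_loss(unet, noisy_latents, timesteps, text_embeddings, target_noise):
    """Mean-squared error between predicted and true noise (reconstruction loss)."""
    pred = unet(noisy_latents, timesteps, encoder_hidden_states=text_embeddings).sample
    return F.mse_loss(pred.float(), target_noise.float())

def combined_loss(unet, instance_batch, prior_batch, prior_weight=1.0):
    """Reconstruction loss on medical images paired with the pattern + class prompt,
    plus a prior preservation loss on class images paired with the class prompt only."""
    rec_loss = diffusion_loss(unet, *instance_batch)   # e.g. "bilateral lung edema mRALE 24, chest x-ray"
    prior_loss = diffusion_loss(unet, *prior_batch)    # e.g. "chest x-ray"
    return rec_loss + prior_weight * prior_loss
```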

Results: After training, the synthetic CXR images generated by the LDM with the combined text prompt “bilateral lung edema mRALE 24, chest x-ray” achieved a Fréchet inception distance (FID) of 9.2158 and a kernel inception distance (KID) of 0.0818 against the real positive CXR images, indicating superior quality over the other methods. When the synthetic positive images were classified together with the real negative images by a trained vision transformer (ViT), the classification accuracy was 0.9975 with a precision of 1.0 and a recall of 0.9950. The synthetic CT images generated by the LDM with the combined text prompt “COVID-19 pneumonia, chest CT” achieved an FID of 7.99 and a KID of 0.041 against the real positive CT slices, again indicating superior quality over the other methods. When the synthetic CT images were treated as COVID-19 positive and classified by a model trained on real CT images, the classification accuracy was 0.965, the F1 score 0.963, the recall 0.930, and the sensitivity 0.930.
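The FID and KID scores reported above can be computed, for example, with torchmetrics; the short sketch below shows one common implementation and is not the authors' evaluation code. It assumes real_images and synthetic_images are uint8 tensors of shape (N, 3, H, W) prepared elsewhere, and the subset_size value is an illustrative choice.

```python
import torch
from torchmetrics.image.fid import FrechetInceptionDistance
from torchmetrics.image.kid import KernelInceptionDistance

def image_quality_metrics(real_images: torch.Tensor, synthetic_images: torch.Tensor):
    """Return (FID, mean KID) between real and synthetic image sets."""
    fid = FrechetInceptionDistance(feature=2048)
    kid = KernelInceptionDistance(subset_size=50)

    fid.update(real_images, real=True)
    fid.update(synthetic_images, real=False)
    kid.update(real_images, real=True)
    kid.update(synthetic_images, real=False)

    kid_mean, kid_std = kid.compute()
    return fid.compute().item(), kid_mean.item()
```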

Conclusion: We conclude that the LDM can synthesize high-quality CXR and CT images with the designated COVID-19 pneumonia patterns using the proposed knowledge-driven method, providing a new approach to cross-modality knowledge representation with large vision models.