NLM IRP Seminar Schedule

Scheduled Seminars on June 28, 2022

Speaker
Sofya Garushyants
Time
11 a.m.
Presentation Title
Do sgRNAs affect the way we think about SARS-CoV-2 mutations?
Location
Virtual - see link in seminar

Contact NLM_IRP_Seminar_Scheduling@mail.nih.gov with questions about this seminar.

Abstract:

The first SARS-CoV-2 genome sequence was published in January 2020, less than a month after virus isolation. Since then, more than five million high-quality genomes have been deposited in GenBank and GISAID. Genome analysis became the key tool throughout the COVID-19 pandemic in the search for new variants of concern and the investigation of virus evolution. To obtain a SARS-CoV-2 genomic sequence, total RNA is extracted from the specimen and then amplified, producing up to 100 overlapping short amplicons that are used to reconstruct the genomic sequence. This approach was successfully applied before to monitor influenza and other viruses. However, while most viruses have mechanisms for internal translation initiation, coronaviruses employ a unique mechanism of transcription that yields subgenomic (sg) RNAs of different lengths. While ORF1ab is translated from the genomic RNA, all other genes are translated from their own sgRNAs. Each sgRNA contains a leader sequence, starts at the beginning of one of the accessory genes, and ends at the 3’-end of the genome. While SARS-CoV-2 genomes are packed into capsids and are inherited, sgRNAs are, to the best of current knowledge, not transferred from host to host. Moreover, sgRNAs have been shown to be sometimes a thousand times more abundant than the genomic RNA during infection, which means that during sequencing, the 3’-terminal portion of the genome is mostly represented by sgRNAs. However, although a mix of genomic RNA and sgRNA is routinely sequenced, the contribution of sgRNAs to the observed genomic variants (if any) has not been investigated.
By analyzing the available public data, we show that the presence of sgRNAs affects low-frequency variants observed in patients. Furthermore, allele frequency in sgRNA can differ from that in the genome. We also found examples of high-frequency variants that make it into the genome consensus but are present only in sgRNAs and not in the genome. Taken together, these findings show that sgRNAs affect variant calling for SARS-CoV-2 genome sequences and imply that viral RNA from only a small number of infected cells is typically sequenced.