Key Insights from the Genome Analysis of the Novel Coronavirus

Ever since the first reports of COVID-19 in Wuhan, China, there has been considerable discussion and debate about the origin of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the virus that causes COVID-19.
In January 2020, China publicly shared the full RNA sequence of the novel virus, giving researchers around the globe a basis to begin work and dig deep into the virus.

The SARS-CoV-2 genome is an RNA molecule of about 30,000 bases containing 15 genes, including the S gene, which codes for a protein located on the surface of the viral envelope. Genome analysis is being carried out in many parts of the world to characterize changes in the genetic information of the virus. Sequencing viral genomes helps researchers understand how the virus varies and lets them draw conclusions about its origin and the different lineages circulating in the population.

In this article, we discuss some key insights gained from the genome analysis of SARS-CoV-2, the novel coronavirus.

Whole-genome sequencing

Whole-genome sequencing (WGS) is the process of determining the entire genetic sequence of an organism's genome at a single time; for an RNA virus such as SARS-CoV-2, that genome is RNA rather than DNA. Whole-genome sequencing can provide critical information and has the potential to revolutionize infectious disease management. Researchers can locate and identify the genetic changes that occur in the virus as it spreads through the population. This approach is useful to:

  • Understand the transmission of the virus

  • Design treatments and vaccines

  • Monitor viral evolution

  • Prepare for the future
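As a rough illustration of how genetic changes can be located, the sketch below compares a sample sequence against a reference, position by position. It assumes the two sequences are already aligned to the same length; the function name `find_substitutions` and the toy sequences are hypothetical and are not taken from any real pipeline or real viral data.

```python
def find_substitutions(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, reference base, sample base) for each mismatch.

    Assumes both sequences are already aligned to the same length.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    valid_bases = set("ACGTU")
    changes = []
    for pos, (ref_base, alt_base) in enumerate(
        zip(reference.upper(), sample.upper()), start=1
    ):
        # Ignore gaps ('-') and ambiguous calls ('N'): they are not confident changes.
        if ref_base != alt_base and ref_base in valid_bases and alt_base in valid_bases:
            changes.append((pos, ref_base, alt_base))
    return changes


# Toy sequences invented for illustration only.
reference = "AUGGCUACGUUAGC"
sample = "AUGGCUACAUUAGU"
print(find_substitutions(reference, sample))
```

Real analyses work on genomes of about 30,000 bases and need an alignment step first, since insertions and deletions shift positions; this sketch only covers single-base substitutions.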

Phylogenetic network analysis to predict future global hot spots

Researchers from Cambridge, UK, and Germany conducted a phylogenetic network analysis of 160 complete SARS-CoV-2 genomes. The study included virus genomes sampled from across the world between 24 December 2019 and 4 March 2020. Through this analysis, the researchers reconstructed the early evolutionary paths of COVID-19 in humans as the infection spread from Wuhan to Europe and North America. They mapped some of the original spread of the new coronavirus through the mutations that define its different viral lineages.

The research revealed three distinct "variants" of COVID-19, with clusters of closely related lineages; they labeled the variants as 'A,' 'B' and 'C.'

  • Type 'A' – The type with the "original human virus genome" and the closest type of COVID-19 to the one discovered in bats. Type A was present in Wuhan but was not the city's primary virus type. Mutated versions of 'A' were observed in Americans reported to have lived in Wuhan, and many A-type viruses were found in patients from the US and Australia.

  • Type 'B' – Wuhan's primary virus type. Type 'B' was prevalent in patients from across East Asia. It did not, however, travel much beyond the region without further mutations, implying a "founder event" in Wuhan or "resistance" against this type of COVID-19 outside East Asia.

  • Type 'C' – The major type found in the European population, seen in early patients from France, Italy, Sweden, and England. It was absent from the study's Chinese mainland sample but was seen in Singapore, Hong Kong, and South Korea.

The phylogenetic network analysis traced established infection routes: the mutations and viral lineages connected the dots between known cases. As per the researchers, "phylogenetic" methods can help predict future global hot spots of disease transmission and surge.
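To give a feel for how mutations can "connect the dots" between genomes, the sketch below links toy sequences by their pairwise differences using a minimum spanning tree. This is a deliberately simplified stand-in and not the network method used in the study, which the article does not describe in detail; the sample names and sequences are invented, and the genomes are assumed to be pre-aligned to equal length.

```python
def hamming(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def minimum_spanning_tree(genomes: dict[str, str]) -> list[tuple[str, str, int]]:
    """Link genomes with the fewest total mutations (Prim's algorithm).

    Returns edges as (genome already in tree, newly linked genome, mutation count).
    """
    names = sorted(genomes)
    if not names:
        return []
    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        # Pick the cheapest edge from the tree to any genome not yet linked.
        dist, u, v = min(
            (hamming(genomes[u], genomes[v]), u, v)
            for u in in_tree
            for v in names
            if v not in in_tree
        )
        edges.append((u, v, dist))
        in_tree.add(v)
    return edges


# Toy aligned sequences invented for illustration only.
genomes = {
    "sample_1": "ACGTACGTAC",
    "sample_2": "ACGTACGTAT",
    "sample_3": "ACGTTCGTAT",
    "sample_4": "GCGTACGTAC",
}
for parent, child, mutations in minimum_spanning_tree(genomes):
    print(f"{parent} -> {child}: {mutations} mutation(s)")
```

A spanning tree forces a single path between any two genomes, whereas a phylogenetic network can keep alternative paths when the data are ambiguous, so the output here should be read only as an intuition aid.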

Genomic Study Points to Natural Origin of COVID-19

There have been some outrageous claims that the new coronavirus causing the pandemic was engineered in a lab and deliberately released to make people sick. However, a study published in the journal Nature Medicine counters such claims by providing scientific evidence that the novel coronavirus arose naturally. The researchers used bioinformatics tools to compare publicly available genomic data from several coronaviruses, including the novel one that causes COVID-19.

The researchers analyzed the parts of the virus genomes that encode the spike proteins. Spike proteins are responsible for the distinctive crown-like appearance of coronaviruses, and a coronavirus needs its spike protein to infect a cell. Over time, each coronavirus has fashioned its spike proteins a little differently, and genome analysis can provide evolutionary clues about these modifications. Genome analysis of the SARS-CoV-2 spike protein has revealed some unique adaptations, one of which gives the virus its ability to bind to angiotensin-converting enzyme 2 (ACE2) on human cells.

Computer models predicted that the new coronavirus would not bind to ACE2 as well as the SARS virus does. However, the researchers found that the spike protein of the new coronavirus bound far better than the computer predictions suggested, likely because of natural selection on ACE2 that enabled the virus to take advantage of a previously unidentified alternate binding site. As per the researchers, this provides strong evidence that the virus was not made in a lab: any bioengineer trying to design a coronavirus that threatened human health probably would never have chosen this particular conformation for a spike protein.

Further analysis showed that the backbone of the new coronavirus's genome most closely resembles that of a bat coronavirus; however, the region that binds ACE2 resembles that of a novel virus found in pangolins. As per the researchers, if the new coronavirus had been manufactured in a lab, scientists most likely would have used the backbones of coronaviruses already known to cause severe disease in humans. This provides additional evidence that the novel coronavirus very likely originated in nature.

Way Forward - The slow mutation rate of SARS-CoV-2 means that changes will emerge over the years

Since January, researchers have analyzed thousands of SARS-CoV-2 genomes and tracked the mutations that have arisen; however, there is no compelling evidence so far that these mutations have significantly changed how the virus affects us.

Researchers have observed that the coronavirus is mutating relatively slowly compared to some other RNA viruses, because viral proteins that act as proofreaders can correct some copying mistakes. However, over time, viruses can evolve into new strains or lineages that are distinctly different from each other. In the future, the coronavirus may pick up some mutations that help it evade our immune systems.
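The arithmetic behind "slow" accumulation can be sketched as genome length multiplied by a per-site substitution rate and by elapsed time. The genome length of about 30,000 bases comes from this article; the rate used in the example call is an illustrative placeholder chosen only to show the calculation, not a figure from the article or a measured value, and the function name is hypothetical.

```python
def expected_substitutions(
    genome_length: int, rate_per_site_per_year: float, years: float
) -> float:
    """Expected number of substitutions along one lineage over the given time.

    Simple linear model: length x rate x time. It ignores selection,
    repeated changes at the same site, and variation in rate between sites.
    """
    if genome_length < 0 or rate_per_site_per_year < 0 or years < 0:
        raise ValueError("inputs must be non-negative")
    return genome_length * rate_per_site_per_year * years


GENOME_LENGTH = 30_000       # approximate length stated in the article
ILLUSTRATIVE_RATE = 1e-3     # ASSUMPTION: placeholder rate, not from the article

per_year = expected_substitutions(GENOME_LENGTH, ILLUSTRATIVE_RATE, 1.0)
print(f"about {per_year:.0f} substitutions per year, {per_year / 12:.1f} per month")
```

Under this placeholder rate, a lineage would differ from its ancestor at only a few dozen of roughly 30,000 positions after a year, which illustrates why meaningful changes are expected to emerge gradually.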

Sequencing more genomes will uncover new details of the virus's history. Researchers are especially interested in studying mutations from regions such as Africa and South America, where only a few genomes have been sequenced.


  • Forster P, Forster L, Renfrew C, Forster M. Phylogenetic network analysis of SARS-CoV-2 genomes. Proc Natl Acad Sci U S A. 2020;117(17):9241‐9243.

  • Andersen KG, Rambaut A, Lipkin WI, Holmes EC, Garry RF. The proximal origin of SARS-CoV-2. Nat Med. 2020;26(4):450‐452.

  • Using whole genome sequencing to help combat COVID-19. University of Cambridge. Available at:
