While there has been extensive focus on understanding the amino acid mutations in the Delta variant’s Spike protein, the mutational landscape of the rest of the SARS-CoV-2 proteome (25 proteins) remains poorly understood.
Many virologists and genomic specialists have focused too heavily on mutations in the Spike protein of the SARS-CoV-2 coronavirus, without also considering mutations found in the Nucleocapsid (N) protein and in the rest of the SARS-CoV-2 genome.
Contrary to ‘old school’ thinking, mutations found across the rest of the genome play a very important part not only in the transmissibility, virulence and immune-evading properties of the emerging variants, but also in the different routes of pathogenesis, in the ways the various host cellular pathways are affected, and in the manifestation of different clinical outcomes.
To make matters more complicated, mutations also act in combination: a particular set of mutations can complement each other and produce different outcomes or attributes.
Also, many studies are based on publicly available data from the GISAID database, which may carry biases associated with sequencing disparities across countries and reporting delays.
Though there is extensive genomic surveillance, there is a lack of clinical annotation of the genomes, limiting the ability to assess the clinical impact of the country-specific differences in the variants.
Furthermore, the GISAID database does not record mutations in the recently discovered ORFs in the SARS-CoV-2 genome such as ORF10, ORF9b, and ORF9c. Assigning the mutations in these ORFs may reveal further differences between SARS-CoV-2 variants and would also constitute critical data.
It seems that the world is developing vaccines and therapeutics, trying to understand the pathogenesis of the virus and how it affects the various human host cellular pathways and the immune system, and looking for ways to treat the resulting conditions, when in the first place we do not even have complete data about what we are dealing with! Researchers have only been working with half-baked data so far!
I personally challenge anyone out there who claims that what I am mentioning here is irrelevant or nonsense.
A new study by researchers from the King Abdullah University of Science and Technology (KAUST)-Saudi Arabia, the Ministry of National Guard Health Affairs-Saudi Arabia and the MOH-Saudi Arabia has identified new mutations in the SARS-CoV-2 Nucleocapsid (N) protein that were found to modulate host interactions and are associated with increased viral loads in COVID-19 patients.
The study team sequenced 892 SARS-CoV-2 genomes collected from patients in Saudi Arabia from March to August 2020. From the assembled sequences, the team estimated the SARS-CoV-2 effective population size and infection rate and outlined the epidemiological dynamics of import and transmission events during this period in Saudi Arabia.
Importantly, the study findings showed that two consecutive mutations (R203K/G204R) in the SARS-CoV-2 nucleocapsid (N) protein are associated with higher viral loads in COVID-19 patients.
The study’s comparative biochemical analysis reveals that the mutant N protein displays enhanced viral RNA binding and differential interaction with key host proteins.
The study found hyper-phosphorylation of the adjacent serine site (S206) in the mutant N protein by mass-spectrometry analysis.
Alarmingly, detailed analysis of the host cell transcriptome suggests that the mutant N protein results in dysregulated interferon response genes.
The study findings provide crucial information identifying the R203K/G204R mutations in the N protein as a major modulator of host-virus interactions and of increased viral load, and underline the potential of the nucleocapsid protein as a drug target during infection.
The study findings were published on a preprint server and are currently being peer-reviewed. https://www.medrxiv.org/content/10.1101/2021.05.06.21256706v2
Most importantly, this is one of the few studies that demonstrate the need to pay attention to mutations in other parts of the virus genome, as these too play a critical role in the characteristics of the emerging variants bearing them.
Seven of the isolates belonged to the A2a clade, while one belonged to the B4 clade. Specific mutations, characteristic of the A2a clade, were also detected, which included the P323L in RNA-dependent RNA polymerase and D614G in the Spike glycoprotein. Further, our data revealed emergence of novel subclones harbouring nonsynonymous mutations, viz. G1124V in Spike (S) protein, R203K, and G204R in the nucleocapsid (N) protein.
The N protein mutations reside in the SR-rich region involved in viral capsid formation and the S protein mutation is in the S2 domain, which is involved in triggering viral fusion with the host cell membrane. An interesting correlation was observed between these mutations and the travel or contact history of COVID-19 positive cases.
Consequent alterations of miRNA binding and structure were also predicted for these mutations. More importantly, the possible implications of mutation D614G (in SD domain) and G1124V (in S2 subunit) on the structural stability of S protein have also been discussed. Results report for the first time a bird’s eye view on the accumulation of mutations in SARS-CoV-2 genome in Eastern India.
The list of mutations detected in the sequences from nine samples is provided (table 2). Seven sequences harboured the important signature mutations of the A2a clade. These consisted of the 14408 C/T mutation resulting in a change of P323L in the RdRp and the 23403 A/G mutation resulting in a change of D614G in the Spike glycoprotein of the virus.
In addition to these, the 24933 G/T mutation in the gene coding for the Spike glycoprotein (G1124V) and the triple-base mutation 28881-28883 GGG/AAC in the gene coding for the nucleocapsid, resulting in two consecutive amino acid changes R203K and G204R, were detected in S2, S3 and in S2, S3, S5 respectively. While the 24933 G/T S gene mutation was unique to these samples and could not be found in any other sequence from India or the rest of the world, the nucleocapsid mutations could be detected in only three other sequences from India (figure 2).
Out of these, two sequences were obtained from individuals with a contact history with a COVID-19 patient who had travelled from Italy. Interestingly, two of the three sequences harbouring these mutations that we obtained were from Kolkata, from individuals with a contact history with one COVID-19 patient who had travelled from London (UK). The third sequence was obtained from a COVID-19 patient from Darjeeling, India, who had a history of travel from Chennai, India. These mutations have been found in 16% of SARS-CoV-2 sequences reported worldwide, from countries such as the UK, Netherlands, Iceland, Belgium, Portugal, USA, Australia, Brazil, etc.
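To see why a single three-base change in the nucleocapsid gene yields two consecutive amino acid changes, the mapping from genome position to codon can be sketched in a few lines of Python. This is only an illustrative sketch, not the method used by either study. It assumes coordinates from the Wuhan-Hu-1 reference genome (NC_045512.2), in which the N gene starts at position 28274 and codons 203 and 204 read AGG and GGA, and it takes the GGG/AAC change to span positions 28881-28883. The function and variable names are hypothetical.

```python
# Assumptions (not stated in the article): coordinates follow the Wuhan-Hu-1
# reference genome (NC_045512.2), where the N gene starts at position 28274,
# and codons 203 and 204 of N read AGG and GGA in that reference.
N_GENE_START = 28274
REF_CODONS = {203: "AGG", 204: "GGA"}

# Only the four codons needed here; a full analysis would use a complete table.
CODON_TABLE = {"AGG": "R", "GGA": "G", "AAA": "K", "CGA": "R"}


def locate(position, gene_start):
    """Return (codon number, 0-based index in codon) for a 1-based genome position."""
    offset = position - gene_start
    return offset // 3 + 1, offset % 3


def mutate_codons(ref_codons, substitutions, gene_start):
    """Apply single-base substitutions {position: (ref_base, alt_base)} to the codons."""
    codons = {number: list(bases) for number, bases in ref_codons.items()}
    for position, (ref_base, alt_base) in substitutions.items():
        number, index = locate(position, gene_start)
        if codons[number][index] != ref_base:
            raise ValueError(f"reference base mismatch at {position}")
        codons[number][index] = alt_base
    return {number: "".join(bases) for number, bases in codons.items()}


def amino_acid_changes(ref_codons, mutant_codons):
    """List changes such as 'R203K' for every codon whose amino acid differs."""
    changes = []
    for number in sorted(ref_codons):
        before = CODON_TABLE[ref_codons[number]]
        after = CODON_TABLE[mutant_codons[number]]
        if before != after:
            changes.append(f"{before}{number}{after}")
    return changes


# GGG -> AAC at positions 28881-28883, written as three single-base changes.
SUBSTITUTIONS = {28881: ("G", "A"), 28882: ("G", "A"), 28883: ("G", "C")}

mutant = mutate_codons(REF_CODONS, SUBSTITUTIONS, N_GENE_START)
changes = amino_acid_changes(REF_CODONS, mutant)
print(mutant)
print(changes)
```

Under these assumptions, the first two substituted bases fall in codon 203 and the third falls in codon 204, which is why the three-base change is reported as the pair R203K and G204R.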
List of mutations detected in the SARS-CoV-2 virus strains identified in West Bengal, India
|Nucleotide Position||Reference Base||Mutant Base||S1||S2||S3||S5||S6||S8||S10||S11||S12||Gene||Nature of Mutation|
|14408||C||T||–||Yes||Yes||Yes||Yes||Yes||–||Yes||Yes||RdRp||P323L, Clade Specific|
|23403||A||G||–||Yes||Yes||Yes||Yes||Yes||–||Yes||Yes||S||D614G, Clade Specific|
|26494||T||C||–||–||–||–||–||Yes||–||–||–||Junction of GU280_gp04 and GU280_gp05||Noncoding|
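The rows above can be cross-checked against the statement that seven sequences carry both A2a signature mutations. The following is a minimal sketch that parses only the three rows reproduced here. The sample order is taken from the table header, and the helper names are hypothetical.

```python
# Sample columns in the order given by the table header.
SAMPLES = ["S1", "S2", "S3", "S5", "S6", "S8", "S10", "S11", "S12"]

# The three table rows reproduced in this article, copied verbatim.
ROWS = [
    "|14408||C||T||–||Yes||Yes||Yes||Yes||Yes||–||Yes||Yes||RdRp||P323L, Clade Specific|",
    "|23403||A||G||–||Yes||Yes||Yes||Yes||Yes||–||Yes||Yes||S||D614G, Clade Specific|",
    "|26494||T||C||–||–||–||–||–||Yes||–||–||–||Junction of GU280_gp04 and GU280_gp05||Noncoding|",
]

# Nucleotide positions named in the text as A2a signature mutations.
A2A_SIGNATURE = {14408, 23403}


def parse_row(row):
    """Return (nucleotide position, set of samples marked 'Yes') for one table row."""
    cells = row.strip("|").split("||")
    position = int(cells[0])
    calls = cells[3:3 + len(SAMPLES)]  # skip position, reference base, mutant base
    carriers = {sample for sample, call in zip(SAMPLES, calls) if call == "Yes"}
    return position, carriers


carriers_by_position = dict(parse_row(row) for row in ROWS)

# Samples that carry every signature mutation, kept in header order.
a2a_samples = [
    sample for sample in SAMPLES
    if all(sample in carriers_by_position[pos] for pos in A2A_SIGNATURE)
]
print(a2a_samples, len(a2a_samples))
```

For these rows the tally gives seven samples (S2, S3, S5, S6, S8, S11 and S12), consistent with the count reported in the text.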
The RdRp (NSP12) gene of SARS-CoV-2 codes for the RNA-dependent RNA polymerase and is vital for the replication machinery of the virus. We detected a total of six mutations in this gene in the nine samples, out of which four were nonsynonymous, including the A2a clade-specific 14408 C/T (RdRp: P323L) mutation. Two individuals, S11 and S12, harboured viral genome sequences that shared a unique 13730 C/T (A88V) mutation which was not found in any other sequence reported from India or the rest of the world. One individual, S10, whose viral sequence belonged to the B4 clade, harboured three mutations in RdRp which appear to be clade specific, two of which were nonsynonymous.
Reference link: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7269891/