Viruses with this mutation are both common and rapidly spreading around the globe. The peer-reviewed version of the study appears January 25 in the journal Cell.
Investigators found that viruses carrying this mutation are similar to the wild-type virus in their virulence and ability to spread but can bind to the human angiotensin converting enzyme 2 (ACE2) receptor more strongly.
Importantly, researchers show that this mutation confers resistance to some individuals’ serum antibodies and to many neutralizing monoclonal antibodies, including one that is part of a treatment authorized for emergency use by the U.S. Food and Drug Administration.
“This means that the virus has many ways to alter the immunodominant domain to evade immunity while retaining the ability to infect and cause disease,” says senior author Gyorgy Snell, Senior Director of Structural Biology at Vir Biotechnology. “A significant finding from this paper is the extent of variability found in the immunodominant receptor binding motif (RBM) on the spike protein.”
The N439K mutation was first detected in Scotland in March 2020. Since then, a second lineage (B.1.258) has independently emerged in other European countries and, by January 2021, had been detected in more than 30 countries across the globe.
The Cell study also reports the X-ray crystal structure of the N439K RBD. “Our structural analysis demonstrates that this new mutation introduces an additional interaction between the virus and the ACE2 receptor,” Snell says.
“A single amino acid change (asparagine to lysine) enables the formation of a new point of contact with the ACE2 receptor, in line with the measured two-fold increase in binding affinity. Therefore, the mutation both improves interaction with the viral receptor ACE2 and evades antibody-mediated immunity.”
Once researchers determined that the N439K mutation did not change virus replication, they studied whether it allowed evasion of antibody-mediated immunity by analyzing the binding of more than 440 polyclonal sera samples and more than 140 monoclonal antibodies from recovered patients.
One way around this problem, researchers say, could be the use of antibodies that target highly conserved sites on the RBD. “The virus is evolving on multiple fronts to try to evade the antibody response,” Snell says.
“That’s only 0.4%, just the tip of the iceberg,” he says. “This underscores the need for broad surveillance, a detailed understanding of the molecular mechanisms of the mutations, and for the development of therapies with a high barrier to resistance against variants circulating today and those that will emerge in the future.”
The SARS-CoV-2 virus, a novel member of the Betacoronavirus genus in the Coronaviridae family, is a positive-sense, single-stranded RNA virus, with approximately 29,900 nucleotides in its entire genome [1]. Structural studies and biochemical experiments have revealed many notable features of SARS-CoV-2.
For example, several vital proteins are key factors at different stages of the viral infectious cycle (Fig. 1), including virus entry, replication, survival, and further infection: 1) the spike (S) protein of SARS-CoV-2 appears to have a high affinity for the human receptor angiotensin-converting enzyme 2 (ACE2), which facilitates viral entry into human cells; 2) the RNA-dependent RNA polymerase (RdRp) catalyzes RNA-template-dependent RNA synthesis, which is responsible for virus replication; and 3) the main protease (Mpro) digests polyproteins that are translated from viral RNA, resulting in individual functional viral proteins. These features help us better understand the biology, immunology, and pathology of the virus, and make these key proteins actionable drug targets.
According to the central dogma of molecular biology, RNAs primarily convey coding information from DNA to proteins; however, with their many noncoding characteristics, RNAs are a group of essential molecules involved in almost all biological processes. RNA molecules and their biological features are used for diagnosis (e.g. testing kits), prevention (e.g. vaccines), and treatment (e.g. therapeutics) of human diseases.
Particularly, as emerging drug targets, RNAs have been developed for innovative and promising therapeutics. RNA therapeutics refer to RNA-based therapeutic technologies and delivery strategies, including 1) biological interventions through RNA molecules, such as small interfering RNAs (siRNAs), microRNAs (miRNAs), noncoding RNAs, antisense oligonucleotides (ASOs), guide RNAs, mRNAs, and other types of RNAs; and 2) biochemical approaches involving mRNA degradation, inhibition of RNA or protein synthesis, and blockage of cellular functions. Therefore, most therapeutics for the treatment of COVID-19 can be considered RNA therapeutics.
It is worth noting that the incidence of nucleotide mutations is significantly high among RNA viruses, due to the lack of proofreading capability of their RNA polymerases. Specifically, the mutation rate of SARS-CoV-2 could be hundreds of thousands of times higher than that in host cells, which benefits viral evolution and hinders disease prevention and treatment. This review aims to address the importance of monitoring mutations in SARS-CoV-2 and their impacts on RNA therapeutics used for the treatment of COVID-19.
Genome structure of SARS-CoV-2 and key drug targets
With ~29,900 nucleotides in the positive strand of its RNA, SARS-CoV-2 has 12 functional open reading frames (ORFs), by which 16 proteins are encoded [2]. Some of these proteins are expressed as polyproteins that require protease cleavage to produce functional individual proteins. Specifically, 16 non-structural proteins (NSPs) are composed of 7096 amino acids.
From its 5′-cap to 3′-tail, the viral genome is arranged primarily by ORF1a and ORF1ab for NSPs, followed by other ORFs for structural proteins, including S, ORF3a, E, M, ORF6, ORF7a, ORF7b, ORF8, N, and ORF10 [1]. Briefly, the SARS-CoV-2 coded proteins can be categorized as replicase (R), protease (Pro), spike protein (S), membrane protein (M), nucleoproteins (N) including RdRp, and envelope protein (E). These proteins are important players in viral entry, replication, and survival in host cells, making them the major targets for therapeutics and vaccines [1].
The spike (S) protein
The coronavirus S protein is a transmembrane glycoprotein that is usually arranged as a trimer, and is the main protein that contributes to viral particle attachment, viral-host fusion, and virus entry into host cells [3], [4]. The monomer of the S protein is approximately 180 kD (1273 amino acids), containing an S1 subunit (671 amino acids) and an S2 subunit (587 amino acids) [5].
The S1 subunit contains two major domains, the N-terminal domain (NTD) and the C-terminal domain (CTD). At the CTD, the receptor-binding domain (RBD) is responsible for the attachment of a viral particle to the surface of host cells. The interacting counterpart molecule on the host cell surface is ACE2. ACE2 is a membrane protein, which primarily functions as an enzyme that regulates blood vessel constriction and blood pressure; however, it works as a receptor protein on the host cell surface during coronavirus infection. Upon binding to the receptor ACE2, the S protein undergoes conformational changes, which trigger cleavage of the S protein by the serine protease TMPRSS2 at the S1/S2 site to produce S1 and S2 subunits. The S2 subunit then undergoes a series of conformational changes, resulting in viral-host cellular fusion, followed by viral entry into host cells [6], [7], [8].
Biochemical characteristics of the S protein suggest that the S protein and its receptor protein ACE2 are valuable targets for developing therapeutics to block SARS-CoV-2 from integrating with host cells. Indeed, a battery of neutralizing antibodies (nAbs) and small molecules working as inhibitors has been investigated.
Several nAbs have been cloned or isolated with binding affinities towards the S protein and/or ACE2 receptor. As summarized by Huang et al. [5], antibodies 4A8 (binding to NTD), 47D11 (binding to S1), n3130 (binding to RBD), n3088 (binding to both S1 and RBD), S309 (inhibiting ACE2/RBD binding), 311mab-31B5 (binding to RBD), 311mab-31D4 (binding to RBD), P2C-1F11 (binding to RBD), P2B-2F6 (binding to RBD), B38 (binding to RBD) and H4 (binding to RBD) are ideal candidates for neutralizing the coronavirus.
It is worth noting that pharmaceutical companies, such as Regeneron Pharmaceuticals and Eli Lilly and Company, have developed antibodies against the SARS-CoV-2 S protein to treat COVID-19, for which we will provide more detail later in the article. In addition, small molecules have been screened to determine the possibility of their interaction with the S protein.
Although some compounds showed a high binding affinity with the S protein, such as rescinnamine, iloprost, and prazosin (drugs for the treatment of hypertension); sulfasalazine, azlocillin, penicillin, and cefsulodin (drugs for anti-bacterial infection); and posaconazole and itraconazole (drugs for anti-fungal infection), only a few compounds were able to interfere with the binding between the S protein and ACE2 receptor.
For example, hesperidin was predicted to be capable of preventing the interaction between the RBD of the S protein and ACE2, thus inhibiting viral entry [9]. In SARS-CoV, the heptad repeat 1 (HR1) domain and heptad repeat 2 (HR2) domain of the S2 subunit work together to facilitate virus–cell fusion and virus entry into host cells [10].
Owing to the high similarity in amino acid sequences between SARS-CoV and SARS-CoV-2, peptides were designed and developed to block the interaction of HR1/HR2 in SARS-CoV-2 with host cell membrane protein. For example, a lipopeptide EK1C4 was developed to block SARS-CoV-2 S-mediated virus–cell fusion [11].
Likewise, the lipopeptide IPB02 was found to be a potent virus-cell fusion inhibitor to prevent SARS-CoV-2 entry [12]. Small molecule drugs were screened to inhibit virus-cell fusion. For example, Viracept (nelfinavir mesylate), the only anti-HIV (human immunodeficiency virus) protease inhibitor approved by the U.S. Food and Drug Administration (FDA), is able to suppress S protein mediated fusion in SARS-CoV-2 infection with a high potency [13].
Proteolysis of the S protein at the S1/S2 and S2 sites by TMPRSS2, cathepsin B and L, or furin is an essential step for SARS-CoV-2 virus entry. Protease inhibitors that can suppress the activities of TMPRSS2, cathepsin B and L, and furin are considered potential drugs that block SARS-CoV-2 infection. Several inhibitors, such as camostat mesylate, E-64d, α-1-PDX, and hexa-D-arginine (D6R), are in this category [5].
The RNA-dependent RNA polymerase (RdRp)
The RdRp is an enzyme complex used by SARS-CoV-2 to replicate its genome and transcribe its viral genes. This complex contains three subunits which are encoded by three non-structural protein (nsp) genes, nsp12, nsp8, and nsp7 [14], [15]. The catalytic subunit nsp12 plays a critical role in viral RNA synthesis and transcription, while subunits nsp8 and nsp7 are cofactors responsible for de novo initiation and primer extension during RNA synthesis [16].
The nsp12 subunit possesses a nucleotidyltransferase domain in the N-terminus and an RdRp domain in the C-terminus [15], [17]. The binding site in the N-terminus of nsp12 is responsible for interacting with nucleoside triphosphate (NTP) substrate, in which amino acids N691, S682, and D623 are pivotal for recognizing the 2′–OH group of the NTP to distinguish RNA synthesis from DNA synthesis, resulting in the enzyme specificity of RdRp [14].
Briefly, upon binding to the RNA template and NTPs, the active site of nsp12 catalyzes the addition of a new nucleotide residue to the RNA chain, a step by which the newly synthesized viral RNA is extended. One copy of subunit nsp7 and one copy of nsp8 bind to different domains of nsp12 to form the RdRp complex while another copy of nsp8 may also provide polymerase activity.
In general, structural analysis has indicated that the subunit nsp12 is responsible for template recognition, nucleotide binding, and catalysis; and the heterodimer consisting of nsp7 and nsp8 serves as the stabilizer of the complex; while the second subunit of nsp8 may be critical for extending the RNA template-binding surface [15], [17].
Amino acid sequence analysis revealed that the RdRp does not show sequence homology with human analog proteins, which provides an opportunity for the development of anti-viral drugs to inhibit the enzyme with minimum risk of affecting hosts. In addition, because RdRp lacks proofreading endonuclease activity, low-fidelity RNA synthesis occurs during replication [18], opening a door for the development of chain terminators and nucleoside analog inhibitors as potential drugs to block viral replication through inhibition of RdRp.
RdRps in many viruses can be targeted by several drugs, including favipiravir, ribavirin, sofosbuvir, baloxavir, dasabuvir, remdesivir, galidesivir, pimodivir and beclabuvir [19]. Advances in structural analysis of SARS-CoV-2 RdRp are critical in facilitating the design of new drug candidates, while molecular modeling is a powerful tool to predict potential inhibitors of the enzyme.
The recently published crystal structure of SARS-CoV-2 RdRp [14], [15] has been used to screen potential inhibitors as drugs for the treatment of COVID-19 using in silico approaches. Molecular docking is an important approach to evaluate potential interactions between molecules. Molecular data on structures and functional mechanisms can provide insights into conformational changes and molecular interactions based on the alteration of energy and binding affinities, which increase the possibility of identifying a battery of real compounds that can inhibit the activity of this protein. Using this comprehensive computational approach, Parvez et al. predicted a group of drugs that may potentially inhibit the activity of SARS-CoV-2 RdRp, and revealed amino acids involved in the interaction between drugs and the RdRp protein [20].
Ranked by their binding affinity (from the highest to the lowest) and binding energy (from the lowest to the highest), these drugs are listed as rifabutin, rifapentine, fidaxomicin, 7-methyl-guanosine-5′-triphosphate-5′-guanosine, ivermectin, remdesivir, and favipiravir, of which many are U.S. FDA-approved drugs [20].
Approved by the Japanese and Chinese governments and evaluated by a Phase-3 clinical trial in the United States, favipiravir is a guanine analog targeting RdRp to treat influenza virus infection. With a high specificity (no interaction with human DNA polymerase and RNA transcriptase), its metabolite ribofuranosyl-5′-triphosphate (F-RTP) is a suicide substrate that inhibits RdRp enzyme activity, thus terminating viral RNA synthesis.
Owing to a relatively high conservation of catalytic domains of RdRps among RNA viruses, it is speculated that favipiravir should have a broad spectrum of activity in inhibition of RNA virus replication; therefore, clinical trials to evaluate the safety and efficacy of favipiravir in the treatment of COVID-19 patients have been conducted or are ongoing in multiple countries including China, Japan, the United States, Russia, Saudi Arabia, and India [21]. The preliminary conclusion is that favipiravir may be a potential drug to treat mild to moderate cases of COVID-19; however, larger and more randomized clinical trials are warranted to ensure its effectiveness and safety [21]. It is worth noting that remdesivir is the only drug approved by the U.S. FDA to treat COVID-19 patients, which will be discussed in detail later in the article.
The main protease (Mpro)
After translation, the two overlapping polyproteins (pp1a and pp1ab) encoded by the ORF1ab of SARS-CoV-2 are digested through a complicated multistep process by the Mpro (nsp5), also known as 3-chymotrypsin-like cysteine protease (3CLpro), resulting in 16 functional proteins, in which the Mpro has an autocleavage property that “frees” itself (nsp5) from nsp4 and nsp6.
While the monomer of Mpro is considered inactive, two subunits of Mpro, each with three domains, work together as a homodimer to catalyze the hydrolysis of polypeptides by an acid-base mechanism. The Mpro possesses a unique dyad (Cys145 and His41) in the active site, which is critical for the enzyme to maintain its substrate specificity.
The substrate specificity of the protease is relatively high: a glutamine must be present in the P1 position and a leucine residue is preferred in the P2 position (other hydrophobic residues are tolerated), for which the substrate is arranged as -P4-P3-P2-P1↓P1′-P2′-P3′- (from the N terminus to the C terminus). No homologous protein has been found in the human genome, limiting off-target effects introduced by inhibition of the enzyme; genetic conservation among various coronaviruses is relatively high, and crystal structures of the main proteases exhibited a high similarity among SARS-CoV, MERS-CoV and SARS-CoV-2, rendering similar substrate specificity among different coronaviruses [22], [23]. These features make Mpro an ideal antiviral target and facilitate drug repurposing based on previous efforts towards developing anti-coronavirus drugs.
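The P1/P2 rule described above can be illustrated with a minimal sketch that scans a peptide for positions compatible with it. The rule itself (glutamine at P1, leucine preferred at P2) comes from the text; the particular set of "other hydrophobic residues" tolerated at P2, the function name, and the toy sequence are assumptions made only for this illustration.

```python
# Sketch: scan a polypeptide for candidate Mpro cleavage sites.
# Rule taken from the text: Gln (Q) at P1, Leu (L) preferred at P2.
# ASSUMPTION: the set of "other hydrophobic residues" tolerated at P2
# is a hypothetical choice made only for this illustration.
PREFERRED_P2 = {"L"}
TOLERATED_P2 = {"F", "M", "V", "I"}  # hypothetical, not from the source


def candidate_cleavage_sites(seq):
    """Return (index, P2, label) for each Q that could occupy P1.

    index is the 0-based position of the P1 glutamine; cleavage would
    occur between seq[index] (P1) and seq[index + 1] (P1').
    """
    sites = []
    # P1 needs a P2 residue before it and a P1' residue after it.
    for i in range(1, len(seq) - 1):
        if seq[i] != "Q":
            continue
        p2 = seq[i - 1]
        if p2 in PREFERRED_P2:
            sites.append((i, p2, "preferred"))
        elif p2 in TOLERATED_P2:
            sites.append((i, p2, "tolerated"))
    return sites


# Toy sequence, invented for the example (not a viral polyprotein).
toy = "AVLQSGFRKMQA"
print(candidate_cleavage_sites(toy))
```

A sequence-only scan of this kind would over-predict sites, since it ignores the P3, P4 and P1′ positions and the structural context of the substrate.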
Information on the biochemical characteristics and X-ray crystal structure of Mpro is utilized to design new drugs against SARS-CoV-2. In one study, two compounds (11a and 11b) targeting Mpro were designed and synthesized; both demonstrated an excellent inhibitory effect on Mpro enzyme activity and significant inhibitory potency against viral infection by SARS-CoV-2.
The interaction of each compound with the active site of Mpro was confirmed by X-ray crystal structures at a very high resolution (1.5 Å), while pharmacokinetic properties of both compounds showed low toxicity, demonstrating that these promising compounds have a great potential to be anti-SARS-CoV-2 drugs [23].
Similarly, with a series of synthesized Mpro inhibitors, Rathnayake et al. [24] demonstrated that two optimized compounds (6j and 6h) showed effective inhibitory activity against several coronaviruses including SARS-CoV-2, using enzyme assays and cell-based assays. Furthermore, these compounds exhibited significant inhibition against MERS-CoV infection in a mouse model. These data suggested that Mpro inhibitors could be developed into drugs to treat patients infected by human coronaviruses.
On the other hand, many compounds have been identified that can potentially inhibit the catalytic activity of Mpro, using in silico screening, virtual drug repurposing, and wet-lab approaches. Utilizing deep learning technology, Ton et al. developed a deep docking platform to provide rapid prediction of docking scores, which enabled the analysis of 1.3 billion compounds and led to the identification of 1,000 potential inhibitors of the Mpro protein [25].
Based on analyses of structure characteristics of Mpro and the free energy of binding between the active site and potential targeting compounds, in silico approaches including molecular docking can provide insights that lead to discovering new drug candidates or repurposing preexisting drugs. In silico studies suggested that many compounds exhibit inhibitory effects against Mpro, and that these compounds include flavonoids, peptides, terpenes, quinolines, nucleoside or nucleotide analogues, conventional protease inhibitors, phenalene, antibiotic derivatives, imidazole, and indoles [26].
Biochemical high throughput screening (HTS) strategies have also been applied to screen synthetic or natural compounds. For example, a pioneer study was conducted by Jin et al. [27], in which they integrated structure-assisted drug design, virtual drug screening and HTS to identify drugs that may target SARS-CoV-2 Mpro.
First, they found a mechanism-based inhibitor (N3) of Mpro through computer-assisted drug design, and then revealed the crystal structure of a complex in which Mpro and N3 interact. Using structure-based virtual screening in combination with biochemical high-throughput screening, they analyzed more than 10,000 compounds, including drugs, drug candidates and other compounds. Finally, they discovered that six compounds can inhibit Mpro significantly with IC50 values between 0.67 and 21.4 μM, of which the compound ebselen showed strong antiviral activity in cell-based assays [27].
In another study, with a recombinant Mpro, Coelho et al. conducted HTS using a fluorescent assay to identify potential inhibitors from drugs, small molecules, and natural products, leading to the discovery of 13 Mpro inhibitors (e.g. thimerosal, phenylmercuric acetate, and benzophenone derivatives) with low IC50 values (0.2–23 μM) [28].
Genetic mutations of SARS-CoV-2
The limited proofreading activity of SARS-CoV-2 RdRp introduces a substantial mutation rate (approximately 10⁻⁴) in the process of virus replication; some of these mutations will be “selected” in the progeny of the viral population due to pressure from the host immunological system and other environmental contributors [29], resulting in an accumulation of two mutations per month in the genome of SARS-CoV-2 [30].
After analyzing sequence data from 7666 viral genomes, van Dorp et al. suggested that mutations were very diversified and occurred in all countries – “everything is everywhere”, although the main phylogenetic clades were formed by virus strains sampled from the same geographic area [31].
Cumulatively, as of September 2020, more than 10,000 single mutations in the SARS-CoV-2 genome have been recorded, in comparison to the first reported sequence reference published on January 5, 2020 [32], although many of the same mutations may be repeated in different virus isolates.
Even though the majority of viral mutations are harmless with minimal impact on viral virulence, some of these mutations may change infectivity, survival capability, pathologic property, or immunogenicity and antigenicity of the virus [30], [32]. For example, the common mutation D614G on the S protein may increase binding affinity between the S protein and host ACE2 receptor, thus enhancing virus loads which lead to increased infectivity.
Although the mutation does not change the ability of antibody recognition, the G-form variant is more readily neutralized by a human antibody [33]. By investigating 80 mutants and 26 modifications of the glycosylation site on the S protein, Li et al. [34] found that several mutants, including A475V, L452R, V483A, and F490L increased their resistance to neutralizing antibodies and deletion of glycosylation sites N331 and N343 decreased viral infectivity.
Notably, the N234Q mutant was significantly resistant towards neutralization by antibodies, whereas the N165Q mutant increased sensitivity towards neutralization therapeutics. Likewise, mutations were identified in the SARS-CoV-2 RdRp. In an earlier study with limited viral sequence data, Pachetti et al. found several mutations in RdRp. Interestingly, it was reported that viral strains with RdRp mutations exhibited a mutation rate three times higher than that of viral strains without an RdRp mutation [35].
Using in silico approaches, Shannon et al. proposed that two mutants, F480L and V557L, may introduce structural and functional changes on RdRp, resulting in patients’ resistance against remdesivir therapy [36]. D722Y, a missense mutation located in the catalytic site of RdRp, along with mutations V472D and L469S, may change the hydrophobicity of the binding pocket of RdRp, leading to a negative impact on the efficacy of remdesivir [37].
Key amino acids in the active site of an enzyme are critical for the function of the enzyme. For example, the Gly-11 located in the dimer interface is essential for Mpro dimerization. The G11A mutation has been shown to completely prevent Mpro from dimerization, leading to an abolishment of Mpro activity of coronaviruses [38]. With many recorded mutations of Mpro, Amamuddy et al. used homology modeling and molecular dynamic analyses to reveal the biological impacts of these mutants. The D48E variant introduced a conformational change at the substrate binding site, resulting in an alteration towards the efficiency and substrate specificity of Mpro [39].
Identification of gene mutation hotspots in the three key proteins
To identify mutation hotspots that are potentially important to COVID-19, we analyzed mutations on the SARS-CoV-2 genome by focusing on the five genes that encode the RdRp (consisting of Nsp12, Nsp7, Nsp8), S, and Mpro proteins. The SARS-CoV-2 reference genome, deposited by Wu et al. in January 2020, was downloaded from the National Center for Biotechnology Information (NCBI) database (under accession number NC_045512).
Complete SARS-CoV-2 genome sequences and their metadata, accrued before Nov 19th, 2020, were downloaded from the GISAID EpiCoV™ database (https://db.cngb.org/gisaid/). We filtered the genomes and selected those that were collected from May 17th, 2020 to Nov 19th, 2020, resulting in a total of 102,478 genomes for subsequent analysis. Next, we disregarded sequences that contained “N” (indicating unidentified nucleotides), and thus retained 55,443 analysis-ready sequences for nsp12, 56,723 for nsp7, 56,391 for nsp8, 48,134 for S, and 54,897 for Mpro.
The number of analysis-ready sequences for each gene was then used to normalize mutation frequencies. The MAFFT program [40] was used to conduct multiple sequence alignments. Statistical analysis was conducted using Python with Pandas and Numpy for data cleaning and Biopython for sequence manipulation.
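The filtering and normalization steps described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the pipeline actually used: it assumes the gene sequences have already been aligned to the reference (for example by MAFFT) and have equal length, and the function name and toy data are hypothetical.

```python
# Sketch: discard sequences with ambiguous bases, then normalize
# point-mutation counts by the number of analysis-ready sequences.
# ASSUMPTION: sequences are already aligned to the reference and have
# the same length; names and toy data are hypothetical.
from collections import Counter


def mutation_frequencies(reference, sequences, offset=1):
    """Return ({mutation label: percent of usable sequences}, n_usable).

    offset is the genome coordinate of reference[0].
    """
    # Drop every sequence that contains an unidentified nucleotide.
    usable = [s.upper() for s in sequences if "N" not in s.upper()]
    counts = Counter()
    for seq in usable:
        for i, (ref_base, base) in enumerate(zip(reference, seq)):
            # Count substitutions only; gaps and other symbols are skipped.
            if base != ref_base and base in "ACGT":
                counts[f"{ref_base}{offset + i}{base}"] += 1
    n = len(usable)
    freqs = {m: 100.0 * c / n for m, c in counts.items()} if n else {}
    return freqs, n


# Toy alignment, invented for the example.
ref = "ACGTAC"
seqs = ["ACGTAC", "ACGTGC", "ACNTAC", "ACGTGC", "TCGTAC"]
print(mutation_frequencies(ref, seqs))
```

Dividing by the per-gene count of usable sequences, rather than by the total number of genomes, keeps the percentages comparable across genes that lost different numbers of sequences to the “N” filter.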
We found a combined total of 3750 point mutations for the five genes (Fig. 2). A total of 236 mutations occurred in more than 0.01% of valid sequences; 138 out of the 236 mutations resulted in amino acid changes. We confirmed the occurrence of mutations located at positions 23,403 and 14,408, which have been reported previously. A23403G, which leads to D614G in the S protein, has been shown to significantly increase the infectivity of SARS-CoV-2 virus by reducing S1 protein shedding [41], [42]. In our analysis, A23403G was present in 40,201 sequences for the S protein and showed the highest occurrence percentage (83.5%). Another highly frequent mutation is C14408T, which occurred in 80.4% of the sequences for nsp12 in our results.
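The link between a genome coordinate such as 23,403 and a residue number such as 614 is simple integer arithmetic once the start of the coding region is known. The sketch below assumes 1-based start coordinates taken from the NC_045512 annotation; these coordinates are not given in the text and should be verified before reuse. nsp12 is left out because its coding sequence spans a ribosomal frameshift, which this simple formula does not handle.

```python
# Sketch: map a genome coordinate to a residue number within a gene.
# ASSUMPTION: start coordinates (1-based) come from the NC_045512
# annotation and are not stated in the text; verify before reuse.
GENE_START = {
    "Mpro": 10055,  # nsp5
    "nsp7": 11843,
    "nsp8": 12092,
    "S": 21563,
}


def amino_acid_position(gene, genome_pos):
    """Return (1-based residue number, 1-based position within the codon)."""
    offset = genome_pos - GENE_START[gene]
    if offset < 0:
        raise ValueError("position lies upstream of the gene start")
    return offset // 3 + 1, offset % 3 + 1


print(amino_acid_position("S", 23403))  # residue 614, as in D614G
```

Under the same assumed coordinates, the formula also reproduces the residue numbers of other mutations reported in this article, for example G10097A in Mpro (residue 15) and C11916T in nsp7 (residue 25).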
We then used the “BLOSUM90” matrix [43] to select high-impact protein mutations. “BLOSUM90” is one of the substitution matrices used for amino acid sequence alignment; it measures how likely a certain amino acid substitution is to occur in a protein based on empirical data. “BLOSUM90” also predicts how significant the impact of an amino acid substitution is to a protein’s structure; a low “BLOSUM90” score indicates a great impact.
The thresholds we used to filter mutations were 1) at least 0.1% for occurrence percentage, 2) causing amino acid changes, and 3) less than 0 for “BLOSUM90” scoring. Our results revealed high-occurrence mutations that have not been previously reported, such as G10097A for Mpro (G15S, 3.02%, 1659 sequences), C11916T for nsp7 (S25L, 0.96%, 546 sequences), C13730T for nsp12 (A97V, 0.67%, 369 sequences), and G24368T for the S protein (D936Y, 0.2%, 94 sequences).
Based on “BLOSUM90” scoring, these non-synonymous mutations, which cause amino acid changes, may significantly affect protein structures, and thus potentially influence the binding affinity between drugs and their target viral proteins. We next assessed the 27 high-ranking mutations, of which 12 occurred in nsp12, 2 in nsp7, 10 in S, and 3 in Mpro. No high-ranking mutation was found in nsp8, even though its protein has a longer sequence (198 aa) than nsp7 (83 aa). This suggests that nsp8 may be more stable than the other four viral genes.
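The three thresholds used for this selection can be expressed as a small filter. The sketch below is a hedged illustration rather than the code used for the analysis: the function name and record layout are hypothetical, the two records labeled toy_* are invented, and the substitution scores in the demo are placeholders, not real BLOSUM90 values.

```python
# Sketch of the three-part filter described above.
# ASSUMPTION: `score` is any callable returning a substitution score.
# With Biopython installed, one option would be:
#   from Bio.Align import substitution_matrices
#   m = substitution_matrices.load("BLOSUM90")
#   score = lambda a, b: m[a, b]
# The scores in the demo below are placeholders, NOT real BLOSUM90 values.


def high_impact(mutations, score, min_percent=0.1):
    """Keep mutations that are frequent, non-synonymous and score < 0."""
    kept = []
    for m in mutations:
        if m["percent"] < min_percent:
            continue  # 1) occurrence percentage too low
        if m["ref_aa"] == m["alt_aa"]:
            continue  # 2) no amino acid change
        if score(m["ref_aa"], m["alt_aa"]) >= 0:
            continue  # 3) substitution score not below 0
        kept.append(m["label"])
    return kept


demo = [
    {"label": "G15S", "ref_aa": "G", "alt_aa": "S", "percent": 3.02},
    {"label": "D936Y", "ref_aa": "D", "alt_aa": "Y", "percent": 0.2},
    {"label": "toy_synonymous", "ref_aa": "L", "alt_aa": "L", "percent": 5.0},
    {"label": "toy_rare", "ref_aa": "D", "alt_aa": "Y", "percent": 0.05},
    {"label": "toy_conservative", "ref_aa": "D", "alt_aa": "E", "percent": 1.0},
]
placeholder_scores = {("G", "S"): -1, ("D", "Y"): -4, ("D", "E"): 1}
print(high_impact(demo, lambda a, b: placeholder_scores[(a, b)]))
```

Passing the scoring function as a parameter keeps the filter independent of any particular matrix, so the same logic could be rerun with a different substitution matrix for comparison.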
Identification of mutation hotspot patterns in different geographic areas
To analyze the mutation hotspot patterns in different geographic areas, we divided our dataset based on six geographic areas: North America, Oceania, Europe, Asia, Africa, and South America. Some mutations showed clearly different patterns for different geographic areas. The G15598A (nsp12, V720I), G13993T (nsp12, A185S), G15766T (nsp12, V776L), and G14202T (nsp12, E254D) mutations only occurred in Europe, but not in other areas (at the time of our analysis). In addition, G15406T (nsp12, A656T), G12067T (nsp7, M75I), C22227T (S, A222V), A22879C (S, N439K), and five other mutations showed an occurrence percentage of >90% in Europe, while T14222G (nsp12, L261C) and C10319T (Mpro, L89F) occurred in >90% of sequences from North America.
We then normalized mutation occurrence over the total number of mutations from each geographic area (Europe: 58602, North America: 22175, Oceania: 15778, Asia: 4409, Africa: 720, and South America: 795). G15594T (nsp12, K718N) and G22992A (S, S477N) showed much higher occurrence in Oceania, whereas C13730T (nsp12, A97V) was more likely to occur in Asia. No prevalent mutations were found in Africa and South America due to the relatively small numbers of mutations detected.
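The normalization by regional totals can be sketched as follows. The totals are those quoted in the text; the per-mutation counts in the demo are invented for illustration, and the choice to report occurrences per 1,000 is an assumption made for readability.

```python
# Sketch: normalize per-region counts of one mutation by the regional
# totals quoted in the text, so regions of different size are comparable.
REGION_TOTALS = {
    "Europe": 58602,
    "North America": 22175,
    "Oceania": 15778,
    "Asia": 4409,
    "Africa": 720,
    "South America": 795,
}


def normalized_occurrence(counts):
    """Return occurrences per 1,000 of the regional total for one mutation."""
    return {
        region: 1000.0 * counts.get(region, 0) / total
        for region, total in REGION_TOTALS.items()
    }


# Hypothetical counts for a single mutation, invented for illustration.
toy_counts = {"Europe": 586, "Asia": 441}
rates = normalized_occurrence(toy_counts)
print(max(rates, key=rates.get))
```

In this toy case the raw count is higher in Europe, yet the normalized rate is roughly ten times higher in Asia, which is the kind of pattern the normalization is meant to expose.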
RNA-based therapeutics for infectious diseases
Efforts to develop RNA therapies were initiated in the 1990s. Recent advances in RNA biology have broadened the scope of therapeutic designs and actionable drug targets for human diseases. RNA therapies can be categorized based on their mechanisms of activity:
- those that target RNAs (e.g., ASOs, siRNAs, and miRNAs),
- those that bind to proteins (e.g., aptamers),
- those that encode proteins (e.g., mRNAs),
- those that are catalytically active (e.g., ribozymes).
Therapeutic agents that target RNAs include ASOs, siRNAs, and short hairpin RNAs (shRNAs). ASOs are single-stranded oligonucleotides (13–25 nt long); they can target mRNAs to modulate their splicing, degradation, and translation, or inactivate miRNAs [e.g. miravirsen for hepatitis C virus (HCV) infection] [44], [45], [46], [47].
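Because an ASO recognizes its target by base pairing, a candidate sequence can be derived as the reverse complement of the chosen mRNA segment. The sketch below assumes a plain DNA oligonucleotide with Watson-Crick pairing and enforces only the 13–25 nt length range given above; chemical modifications, secondary structure, and off-target checks are ignored, and the function name and target segment are hypothetical.

```python
# Sketch: derive an antisense sequence for a target mRNA segment.
# ASSUMPTION: a plain DNA oligonucleotide with Watson-Crick pairing;
# chemical modifications and off-target checks are ignored.
COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def antisense_oligo(mrna_segment, min_len=13, max_len=25):
    """Return the reverse complement (5'->3') of an mRNA segment."""
    seg = mrna_segment.upper()
    if not min_len <= len(seg) <= max_len:
        raise ValueError("segment length outside the 13-25 nt range")
    return "".join(COMPLEMENT[base] for base in reversed(seg))


# Toy target segment, invented for the example.
print(antisense_oligo("AUGGCUAGCUUAGC"))  # GCTAAGCTAGCCAT
```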
siRNAs are natural, double-stranded RNA molecules, whereas shRNAs are artificial with a hairpin structure that can give rise to siRNAs; both siRNAs and shRNAs regulate post-transcriptional gene silencing through RNAi (RNA interference) (e.g. patisiran for amyloidosis treatment) [48], [49].
RNAi-based therapies for infectious diseases gained momentum in the early 2000s. In 2002, McCaffrey et al. effectively targeted a sequence from HCV via RNAi in mice, demonstrating the potential of RNAi therapy in silencing viruses [50]. In 2003, Song et al. suppressed HIV replication in primary macrophages by targeting CCR5, a major HIV-1 coreceptor, and p24, a viral structural gene, via siRNA-mediated RNAi [51].
The most recent rising stars in RNA therapies are miRNAs, which also modulate gene silencing through RNAi. miRNAs may serve as both therapeutic agents and targets [52]. For example, miravirsen, an investigational ASO drug, treats HCV patients by targeting and functionally inhibiting host miR-122 in the liver to reduce HCV RNA levels without causing viral resistance [47].
RNA aptamers are single-stranded RNA oligonucleotides; they can bind to their cognate targets with high specificity and affinity to modulate their functions [53]. While other RNA-based therapeutics target intracellular components, RNA aptamers are able to target both intracellular and extracellular targets [54].
In addition, by binding to cell surface proteins, RNA aptamers serve as a powerful tool for targeted delivery of conjugated therapeutic agents (e.g., siRNAs and small molecules) into specific cell types [55]. RNA aptamers may exert antiviral functions in various mechanisms at different stages of the viral lifecycle [56].
First, aptamers can inhibit virus entry into host cells. For example, RNA aptamer AS1411 targets nucleolin, a cellular protein involved in viral entry, which disrupts the binding of the dengue virus [57]. Second, aptamers can inhibit viral replication. For instance, a 24-mer RNA aptamer targets the Japanese encephalitis virus (JEV) methyltransferase to suppress methylation activity that is required for JEV production in host cells [58].
Another mechanism is through aptamer-mediated delivery in chimera therapies. For example, an aptamer-siRNA chimera specifically targets the envelope glycoprotein GP120 of HIV-1 to inhibit viral replication; the aptamer transfers the siRNA to reach HIV-infected cells and mediate GP120 suppression [59]. RNA aptamers show immense potential in treating infectious diseases considering their broad targeting ability and versatile mechanisms of action.
mRNA-based therapies are a transformative platform for prevention and intervention of infectious diseases. mRNAs from in vitro transcription (IVT) have been developed for protein replacement therapies and vaccines for cancer and infectious diseases since the 1990s [60].
Synthetic mRNAs resemble natural mRNAs and can be delivered in vivo (e.g. direct injection) and ex vivo (e.g. transfection and electroporation of human cells) for protein replacement or supplementation purposes. mRNA vaccines are designed to deliver the sequences of chosen antigens from pathogens for antigen expression in host cells.
A proof of concept for RNA-based vaccines emerged in 1993 with the injection of an mRNA from the influenza virus into mice, which induced a virus-specific T cell response [61]. There are two major types of mRNA vaccines: conventional and self-amplifying.
Conventional mRNAs encode viral antigens and are immediately translated upon cell entry and endosomal release; the antigen levels rely on successfully delivered conventional transcripts during vaccination [62]. In self-amplifying mRNAs, the sequences of viral antigens and NSPs enable intracellular RNA amplification, which results in high levels of antigens [63]. mRNA vaccines have been investigated extensively, and numerous candidates are in preclinical and clinical development. Several excellent reviews have detailed mRNA vaccines for infectious diseases [62], [63], [64], [65].
Ribozymes – RNA enzymes – are another group of RNA agents under exploration for gene therapy of various diseases. For infectious diseases, ribozymes are particularly suitable for inhibiting replication of RNA viruses including HIV-1, HCV, and hepatitis B virus [66], [67], [68], [69], [70], [71], [72].
For example, hammerhead and hairpin ribozymes have been shown to impede HIV-1 infection in cell culture. The major cleavage sites for ribozymes reside in the 5′ leader, the packaging sequence (Ψ) and the env, gag and tat genes of HIV-1; most of these sites show little evidence of natural variation among HIV-1 strains [73], [74], [75].
RNA therapies show promising potential with their unique advantages but also face great challenges. RNA agents are fast acting compared to DNA therapeutics that need to enter the nucleus for transcription. RNA therapies are reversible due to natural degradation. Also, there is little risk of exogenous RNAs incorporating into the host genome and creating host genome instability. Lastly, RNA manufacturing is rapid, cost-efficient and easily scalable.
The two main obstacles that hamper clinical application of RNA therapies are stability and delivery [76]. Unmodified RNAs are notoriously prone to rapid degradation and clearance from circulation, which reduces their bioavailability and half-life. Delivery of RNA agents to the right action sites of target cells with high enough concentration is critical to therapeutic efficacy.
Although RNAs may diffuse into or be actively endocytosed by host cells, cell entry efficiency may be very low, partly because both RNAs and cell membranes are negatively charged. Additionally, activation of innate immune nucleic acid sensors and off-target effects remain to be addressed to improve the safety of RNA therapies [77]. Efforts to overcome these issues will propel the development and qualification of RNA therapeutics.
Promising RNA therapeutics targeting the three key proteins
Several promising RNA therapeutics have completed or are in ongoing clinical trials, and a few have been approved or granted Emergency Use Authorization (EUA) by the U.S. FDA. Remdesivir has drawn great attention as it is the first and only small molecule drug approved by the U.S. FDA to specifically treat COVID-19 patients. As an adenosine analog, remdesivir was initially developed to treat infection with another RNA virus, Ebola virus [78], [79], [80].
Remdesivir binds to the binding pocket of RdRp and terminates RdRp-dependent viral RNA synthesis [81], [82]. In vitro experiments showed that remdesivir was highly effective in inhibiting Ebola viral replication in several human cell lines, and in vivo studies proved that remdesivir significantly diminished Ebola virus in nonhuman primates [19].
Likewise, its effectiveness in inhibiting SARS-CoV-2 was supported by results from in vitro experiments and preclinical in vivo animal studies [83]. Upon entry into host cells, the prodrug remdesivir is converted to remdesivir triphosphate (TP), the active metabolite, which competes with adenosine triphosphate (ATP) as a substrate for the viral RdRp.
When RdRp incorporates remdesivir-TP in place of ATP, synthesis of the nascent RNA strand terminates, thus inhibiting virus replication [84].
The mechanism of remdesivir in interacting with coronavirus RdRp and its inhibitory effects from both in vivo and in vitro experiments inspired many finished or ongoing clinical trials to test the safety and efficacy of remdesivir in the treatment of COVID-19 patients.
For example, a randomized clinical trial with 596 patients reported that the administration of remdesivir could benefit severe COVID-19 patients, compared to patients with placebo [85]. In another randomized, open-label multi-center clinical trial with 394 participants, researchers found that both 5-day and 10-day courses of remdesivir treatment can improve symptoms of severe COVID-19 patients although there was no significant difference between 5-day and 10-day treatments [86].
Importantly, a double-blind, randomized, placebo-controlled trial (ACTT-1) was conducted with 1062 patients, of whom 541 were treated with remdesivir and 521 were assigned a placebo. The ACTT-1 Study Group stated in its final report that “remdesivir was superior to placebo in shortening the time to recovery in adults who were hospitalized with Covid-19 and had evidence of lower respiratory tract infection” [87]. On October 22, 2020, the U.S. FDA approved remdesivir (brand name Veklury) as the first treatment for hospitalized COVID-19 patients who are 12 years of age or older and weigh >40 kg [88].
RNA-based therapies targeting the S protein have achieved great accomplishments as well. Moderna Inc.’s mRNA-1273 is an mRNA-based vaccine that encodes the S protein of SARS-CoV-2 and is delivered by lipid nanoparticles (LNP); the nucleoside-modified mRNA encodes the S protein stabilized in its prefusion conformation, eliciting antibodies that block the virus from entering host cells [89]. Vaccination with mRNA-1273 has shown neutralizing activities in nonhuman primates and human subjects in a Phase I study with no trial-limiting safety concerns [89], [90].
As of Oct 22, 2020, Moderna Inc. had completed enrollment of 30,000 participants for a Phase III clinical trial of mRNA-1273 [91]. Moderna Inc. announced with an early data release on Nov 16, 2020 that the mRNA-1273 vaccine candidate showed an efficacy of 94.5% (p < 0.0001) [92]. Another mRNA-based vaccine candidate in Phase III clinical testing is BNT162b2 from Pfizer and BioNTech. BNT162b2 is engineered similarly to mRNA-1273 (i.e. nucleoside modification and LNP encapsulation) and operates under the same mechanism, encoding the S protein in its prefusion conformation [93].
BNT162b2 vaccination in mice and rhesus macaques both induced strong T-cell responses and fully protected the lungs of rhesus macaques from SARS-CoV-2 infection challenge. In a press release on Nov 18, 2020, Pfizer reported that BNT162b2 was 95% effective in their completed Phase III study [94].
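The efficacy percentages quoted above for mRNA-1273 and BNT162b2 are conventionally derived from the ratio of attack rates in the vaccinated and placebo arms. The following is a minimal sketch of that calculation, assuming simple attack rates rather than the person-time analysis a trial would actually use; the function name and the case counts are hypothetical and are not taken from either trial.

```python
def vaccine_efficacy(cases_vaccine, n_vaccine, cases_placebo, n_placebo):
    """Return vaccine efficacy as 1 minus the ratio of attack rates.

    Simplified illustration: real trials typically use person-time
    incidence and report confidence intervals.
    """
    if n_vaccine <= 0 or n_placebo <= 0:
        raise ValueError("group sizes must be positive")
    if cases_placebo == 0:
        raise ValueError("efficacy is undefined with zero placebo cases")
    attack_rate_vaccine = cases_vaccine / n_vaccine
    attack_rate_placebo = cases_placebo / n_placebo
    return 1.0 - attack_rate_vaccine / attack_rate_placebo


# Hypothetical counts chosen for illustration only (not trial data).
efficacy = vaccine_efficacy(cases_vaccine=5, n_vaccine=10_000,
                            cases_placebo=100, n_placebo=10_000)
print(f"Estimated efficacy: {efficacy:.1%}")
```

With equally sized arms the result depends only on how the cases split between the two groups, which is why interim readouts can be reported from a relatively small number of cases.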
Additionally, Regeneron Pharmaceuticals Inc. characterized a collection of antibodies against the SARS-CoV-2 S protein using sera from humanized mice and convalescent patients, and selected a pair of antibodies (REGN10933 and REGN10987) with high potency to bind the RBD and block viral entry to host cells, resulting in a potential therapeutic for the treatment of COVID-19 [95], [96]. An ongoing randomized, double-blind clinical trial has shown that Regeneron’s cocktail can decrease viral load and improve symptoms of patients, suggesting that the cocktail could be an effective addition to the standard of care for COVID-19 patients. LY-CoV555 is an S protein antibody developed by Eli Lilly and Company that neutralized SARS-CoV-2 and protected non-human primates from SARS-CoV-2 infection [97].
Furthermore, preliminary data analysis indicated that the combination therapy of LY-CoV555 and LY-CoV016 (namely bamlanivimab and etesevimab) decreased viral load, improved symptoms of COVID-19 patients, and reduced COVID-related hospitalization and emergency room visits, in a randomized, double-blinded clinical trial BLAZE-1 (NCT04427501) (https://investor.lilly.com/news-releases/news-release-details/lilly-provides-comprehensive-update-progress-sars-cov-2). As of Nov 21, 2020, the U.S. FDA has granted Emergency Use Authorizations for both bamlanivimab and casirivimab-imdevimab cocktails to treat COVID-19 patients who are 12 years of age or older, more than 40 kg of weight, and at high risk for progressing to severe COVID-19 [98], [99].
The main proteinase (Mpro) plays an important role in viral replication and is another primary target for the development of antiviral therapy against COVID-19 [22]. The Mpro of SARS-CoV-2 shares high levels of similarity in sequence and structure with other betacoronaviruses; thus, previously developed compounds that target the main proteases of other coronaviruses may be repurposed for combating the current pandemic [22].
Several compounds that inhibited the main proteases of SARS-CoV or MERS-CoV have been shown to co-crystallize with that of SARS-CoV-2 [23], [27], [100]. Recently, boceprevir, GC-376, and calpain inhibitors II and XII were identified to inhibit SARS-CoV-2 viral production by targeting the Mpro with high potency [101]. Specifically, these inhibitors showed EC50 values ranging from 0.49 to 3.37 µM in suppressing viral production in cell culture. In addition, lopinavir and ritonavir are two HIV-1 proteinase inhibitors that are being tested in clinical trials for treatment of COVID-19.
In a randomized, controlled, open-label trial with 199 subjects, lopinavir–ritonavir treatment showed no benefit beyond standard care to hospitalized adult patients with severe COVID-19 [102]. An open-label, randomized, Phase 2 trial tested a triple combination of interferon beta-1b, lopinavir–ritonavir, and ribavirin for the treatment of hospitalized adult patients with COVID-19 [103]. Early combination therapy was found to be safe and performed better than lopinavir–ritonavir alone in symptom alleviation and recovery time for patients with mild to moderate COVID-19 [103]. Early intervention and combinatory treatment may have individually or together helped achieve positive results in this latter trial. More investigation is needed to identify effective treatments against this appealing target.
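The EC50 values quoted above for the Mpro inhibitors (0.49 to 3.37 µM) are the concentrations at which viral production is suppressed by half. A minimal sketch of how an EC50 translates into predicted inhibition at other concentrations is shown below, assuming a standard Hill dose-response curve with a Hill coefficient of 1; that coefficient, the function name, and the test concentration are assumptions for illustration and are not reported in the source.

```python
def fraction_inhibited(concentration_um, ec50_um, hill_coefficient=1.0):
    """Fraction of viral production suppressed under a Hill model.

    By definition the result is 0.5 when concentration equals EC50.
    The default Hill coefficient of 1.0 is an assumption.
    """
    if concentration_um < 0:
        raise ValueError("concentration must be non-negative")
    if ec50_um <= 0:
        raise ValueError("EC50 must be positive")
    if concentration_um == 0:
        return 0.0
    ratio = (concentration_um / ec50_um) ** hill_coefficient
    return ratio / (1.0 + ratio)


# Compare the two ends of the reported EC50 range at one
# hypothetical drug concentration (1 µM).
for ec50 in (0.49, 3.37):
    inhibition = fraction_inhibited(1.0, ec50)
    print(f"EC50 = {ec50} µM -> {inhibition:.0%} inhibition at 1 µM")
```

A lower EC50 means more suppression at the same dose, so under this model the most potent compound in the range would be expected to inhibit substantially more viral production at 1 µM than the least potent one.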
The impact of mutations on drug safety and efficacy
Mutations in coronaviruses may reduce drug efficacy in several ways: 1) by decreasing the binding affinity of a drug-target complex; 2) by making the drug target constitutively active; and 3) by turning an antagonist drug into an agonist. The history of anti-HIV treatment has illustrated the impact of viral mutation rates on drug resistance [104].
The HIV-1 virus produces a possible single-nucleotide mutation per patient per day, which has led to resistance to the first approved anti-HIV drug, azidothymidine [105]. Similarly, multiple resistances have been observed for treatment of HCV, another fast-mutating RNA virus [5]. Moreover, resistance to protease and polymerase inhibitors exists in treatment-naïve patients, i.e. without selection that favors the mutations [106]. Combining multiple drugs, instead of increasing monotherapy potency, has proven to be a successful antiviral regimen that minimizes resistant mutation appearance and improves effectiveness [104].
The rapid worldwide spread of SARS-CoV-2 has drawn immense attention to its mutations and strains. Analysis of 220 SARS-CoV-2 genomic sequences derived from patients worldwide revealed different mutations in RdRp that emerged at different times (December 2019 to mid-March 2020) and locations, suggesting that SARS-CoV-2 is evolving and that multiple strains with their specific mutation patterns may coexist [35].
Among the newly identified mutations is P14408L, which resides near a hydrophobic cleft of RdRp and is associated with an overall increased mutation rate [35]. Previous studies have shown that naturally emerging mutations in RdRp may lead to drug resistance or reduce treatment efficacy due to weakened drug-RdRp binding, particularly near the mutation sites [35], [107], [108], [109]. In a recently published case report, a point mutation in RdRp (D484Y) was associated with the failure of remdesivir treatment in a 76-year-old female COVID-19 patient with post-rituximab B-cell immunodeficiency [110].
It is suspected that resistant mutations emerge under remdesivir treatment pressure which leads to treatment failure [110]. Therefore, it is crucial to identify resistant mutations and evaluate their impact on the conformation and activity of RdRp and its susceptibility to antiviral treatment. Meanwhile, new antiviral drugs for different targets along the viral lifecycle and cocktail therapies are planned for further testing to improve COVID-19 treatment.
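Point mutations such as D484Y and N439K are written in the usual substitution shorthand: the one-letter code of the original amino acid, the residue position, and the one-letter code of the replacement. A minimal sketch of a parser for this notation follows, assuming single-residue substitutions only; the names `Substitution`, `parse_substitution`, and `describe` are hypothetical helpers, not part of any published tool.

```python
import re
from typing import NamedTuple

# One-letter codes for the 20 standard amino acids.
AMINO_ACIDS = {
    "A": "alanine", "R": "arginine", "N": "asparagine",
    "D": "aspartic acid", "C": "cysteine", "Q": "glutamine",
    "E": "glutamic acid", "G": "glycine", "H": "histidine",
    "I": "isoleucine", "L": "leucine", "K": "lysine",
    "M": "methionine", "F": "phenylalanine", "P": "proline",
    "S": "serine", "T": "threonine", "W": "tryptophan",
    "Y": "tyrosine", "V": "valine",
}


class Substitution(NamedTuple):
    wild_type: str
    position: int
    mutant: str


_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label):
    """Parse a label such as 'D484Y' into its three components."""
    match = _PATTERN.match(label.strip())
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    wild_type, position, mutant = match.groups()
    if wild_type not in AMINO_ACIDS or mutant not in AMINO_ACIDS:
        raise ValueError(f"unknown amino acid code in {label!r}")
    return Substitution(wild_type, int(position), mutant)


def describe(label):
    """Spell out a substitution label in words."""
    sub = parse_substitution(label)
    return (f"{AMINO_ACIDS[sub.wild_type]} to {AMINO_ACIDS[sub.mutant]} "
            f"at position {sub.position}")


print(describe("D484Y"))
print(describe("N439K"))
```

Keeping the position as an integer makes it straightforward to group mutations by residue, for example to check which of them fall near a drug-binding site.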
Host-directed therapy design may resolve drug resistance stemming from fast-emerging viral mutations and strains. A recent study established an interaction map that consisted of 332 protein–protein interactions between SARS-CoV-2 and humans using affinity-purification mass spectrometry [111]. Through these interactions, the authors identified 66 host proteins or factors that were targeted by 69 approved or investigational compounds. The findings may facilitate drug repurposing for host-directed intervention towards mutation-associated drug resistance and pan-viral therapies as new treatment regimens.
reference link: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7816569/
More information: Cell, Thomson et al.: “Circulating SARS-CoV-2 spike N439K variants maintain fitness while evading antibody-mediated immunity” www.cell.com/cell/fulltext/S0092-8674(21)00080-5 , DOI: 10.1016/j.cell.2021.01.037