HIV replication in the human body requires that specific viral RNAs be packaged into progeny virus particles.
The study, published last week in the Proceedings of the National Academy of Sciences, found that HIV chooses its viral RNA genome – the “source code” that it injects into healthy human cells to infect them – based on functions attributable to just two nucleotides.
“It’s just this two-nucleotide difference that makes such a dramatic effect,” said Karin Musier-Forsyth, senior author of the study, Ohio Eminent Scholar and a professor of chemistry and biochemistry at The Ohio State University. “If we can prevent it from packaging its own genome, we can prevent it from spreading inside the body.”
“Just like we need a genome encoded by DNA, viruses have their own genomic DNA or RNA—in the case of HIV it’s RNA—and they have to package their genomic RNA and that’s what this whole study is about,” she said. “It’s an essential step for how we understand the replication of the virus.”
RNA is a string of nucleotides, and it is present in some form or another in all living things, including viruses. In HIV, it carries the genetic information that allows the virus to copy itself inside a host—the human body. HIV RNA comprises about 9,800 nucleotides.
“But the viral genome from HIV is made in small amounts, and it is very selectively packaged as genomic RNA, in addition to serving as mRNA to make viral proteins. How does the virus find this genomic RNA to package and not just package any old RNA in our cells?”
The researchers examined the structures of two nearly identical HIV RNA strings and found that the virus used a two-nucleotide difference on the very end of the RNA strings to distinguish between genomic RNA and viral mRNA. One, they found, was more efficient at being packaged as a genome than the other due to the conformations, or structures, that it formed.
The findings could have implications for future HIV treatments that target RNA and would be different from current HIV treatments, which primarily target viral proteins. New HIV drugs based on this discovery are likely years away, but Musier-Forsyth said this finding is an important scientific step.
“Now that we understand more about the structure of the RNA, we could develop therapeutics, whether they be small molecules or other new nucleic acid therapeutics, that could lock the RNA into a conformation that wouldn’t be packaged. If it can’t package its genome then it can’t replicate,” Musier-Forsyth said.
Other Ohio State researchers who contributed to this study include Shuohui Liu and Jonathan P. Kitzrow. This work was supported by the National Institutes of Health.
The HIV-1 genome (gRNA) is a single-stranded RNA molecule that encodes the essential structural polyproteins Gag, Gag-Pol and the envelope glycoprotein Env, together with a number of accessory factors that aid viral replication and immune evasion. gRNA thus serves as a template for the translation of the viral structural proteins found in Gag and the enzymes encoded by Pol, as well as being captured by Gag for packaging into virions.
During packaging, the gRNA undergoes dimerisation, resulting in two copies of the genome being encapsidated into the budding virions. During or after budding, the Gag and Gag-Pol polyproteins are cleaved into their individual components—matrix (MA), capsid (CA) and nucleocapsid (NC)—as well as three smaller peptides [1], and the enzymes reverse transcriptase (RT), integrase (IN) and protease (PR) by PR itself.
Upon infection of a new cell, RT initiates the reverse transcription of the ssRNA genome into dsDNA using a cellular tRNALys3 primer that anneals to the gRNA at some stage during viral assembly and budding. The chaperone activity of NC facilitates the annealing of tRNA to the primer binding site (PBS), and aids reverse transcription by destabilising the secondary structures that would cause the pausing or stalling of the enzyme [2–8]. IN then integrates the freshly synthesised proviral DNA into the newly infected host cell genome, from where it can be transcribed.
A critical step in this complex viral replication cycle is the recognition and packaging of the viral gRNA. The gRNA packaging process is highly specific and represents a novel drug target [9]. It has proven hard to study in structural detail due to the transient nature of the sequential steps involved, which likely involve multiple different RNA structures. The recognition of the gRNA by Gag is dependent upon sites within the highly conserved 5′ UTR [10–13].
This region consists of conserved hairpin/helical structures, including the trans-activation response element (TAR), a poly(A) sequence, the tRNA primer binding site (PBS) and the major packaging signal (Ψ) [14,15] (Figure 1). Ψ is a vital component of the dimerisation process, and is composed of three stem–loops (SL1–SL3). SL1 contains a palindromic dimer initiation site (DIS) and facilitates RNA dimerisation through an intermolecular kissing-loop interaction [16–22], SL2 contains the splice donor (SD) site and SL3 is a major determinant of gRNA encapsidation [23–25]. An additional stem–loop that spans the Gag start codon, SL4, has been proposed to regulate Gag translation by preventing interaction between the U5 region and the gag initiation codon by forming the U5:AUG helix [15,26–28].
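The "palindromic" property of the DIS mentioned above means that the loop sequence equals its own reverse complement, so two identical RNA molecules can base pair loop to loop. A minimal Python sketch of that check follows. The hexamer GCGCGC is used purely as an illustrative self-complementary sequence and the function names are hypothetical; the text above does not give the DIS sequence.

```python
# Watson-Crick complements for RNA bases (G-U wobble pairs are ignored here).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(rna.upper()))


def is_self_complementary(rna: str) -> bool:
    """True if the sequence can pair with an identical copy of itself."""
    return rna.upper() == reverse_complement(rna)


# Illustrative loop sequences (assumptions, not taken from the text above).
palindromic_loop = "GCGCGC"
non_palindromic_loop = "GGGAAA"

print(is_self_complementary(palindromic_loop))
print(is_self_complementary(non_palindromic_loop))
```

A sequence that passes this check can, in principle, form the intermolecular kissing-loop contact with the same loop on a second genome copy, which is why a palindromic loop suits dimer initiation.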

It has been previously proposed that the shift from the translation of gag to gRNA dimerisation is facilitated by an RNA structural switch. There have been two predominant models for this. The first is a switch from a ‘Long Distance Interaction’ (LDI) conformation to a ‘Branched Multiple Hairpin’ (BMH) conformation [16,27,29]. In the LDI conformation, the DIS is prevented from forming the kissing-loop interaction by being base-paired with the poly(A) element, and the gag initiation codon is located within a less stable structure than in the BMH conformation, which facilitates translation. In the BMH conformation, the gag initiation codon is sequestered through base pairing with the U5 region (referred to as the U5:AUG interaction), releasing the DIS and allowing it to base pair with the DIS on a second gRNA [16,27,29,30].
Subsequent work has broadly confirmed the BMH model; however, mutants created to prevent the formation of the LDI conformation led to reduced dimerisation, but did not impact Gag translation [30]. More recent approaches using NMR and in-gel SHAPE suggest an alternative pseudoknot structure for the monomer [15,31].
In this structure, the DIS binds to a complementary site in the U5 region and SL4 forms; dimerisation then accompanies a switch from U5:DIS to U5:AUG (Figure 1). The precise transcriptional start site and the number of 5′ Gs the transcript contains have also been proposed to control RNA structural changes [32–34].
The initiation of gRNA encapsidation is generally accepted to involve a small number of Gag proteins binding to Ψ [35]. The switch to the U5:AUG interaction from U5 being in an alternative intramolecular pairing frees the DIS sequence for intermolecular base pairing via a kissing-loop interaction and the formation of ‘loose dimers’ [15,17,21,29,36–39]. In association with the NC domain of the Gag protein, the RNA molecules refold to form a more stable ‘tight dimer’ or ‘extended duplex’ [40–43], the intermolecular extent of which may extend significantly beyond SL1 itself [44]. The ribonucleoprotein complex containing a small number of Gag proteins and the gRNA traffics to the plasma membrane where additional, exposed binding sites in the gRNA allow the recruitment of further Gag proteins to form the immature viral particle [42,45–48].
We previously demonstrated an in-gel SHAPE (selective 2′-OH acylation analysed by primer extension) method that was able to resolve the structures of the monomeric and dimeric HIV-1 leader sequences without the need for stabilising mutagenesis, and that identified certain key structural changes involved in RNA dimerisation [31]. SHAPE reagents such as NMIA (N-methyl isatoic anhydride) react covalently with the 2′-OH of nucleotides irrespective of base, in direct proportion to the flexibility of the nucleotide backbone at that position.
They therefore act as a marker of whether a nucleotide is single-stranded or base-paired. SHAPE data are used in conjunction with modelling software to derive a secondary structural model for the structure or range of structures (‘ensemble’) of the RNA. Using these reagents in a native gel matrix enables the separation and isolation of individual RNA conformers. Our previous use of this technique under native conditions demonstrated differences in NMIA reactivity within the U5, AUG and SL1 sequences that marginally favoured the pseudoknot model of the monomeric structure over other models [15].
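To make the link between reactivity and pairing state concrete, the following is a minimal Python sketch of how per-nucleotide SHAPE signals might be turned into coarse labels. It is not the pipeline used in the study: the background subtraction, the scaling by the mean of the top 10% of values, the 0.3 and 0.7 cut-offs, the function names and the input numbers are all assumptions chosen for illustration.

```python
from typing import List


def normalise(treated: List[float], control: List[float]) -> List[float]:
    """Subtract the no-reagent control and scale to the most reactive sites.

    Assumption: scaling uses the mean of the top 10% of background-subtracted
    values (at least one value); real pipelines use more careful schemes.
    """
    diffs = [max(t - c, 0.0) for t, c in zip(treated, control)]
    top_n = max(1, len(diffs) // 10)
    scale = sum(sorted(diffs, reverse=True)[:top_n]) / top_n
    if scale == 0:
        return [0.0] * len(diffs)
    return [d / scale for d in diffs]


def classify(reactivities: List[float],
             low: float = 0.3, high: float = 0.7) -> List[str]:
    """Label each nucleotide by reactivity (cut-offs are assumptions)."""
    labels = []
    for r in reactivities:
        if r < low:
            labels.append("likely paired")
        elif r < high:
            labels.append("intermediate")
        else:
            labels.append("likely single-stranded")
    return labels


# Hypothetical signal intensities for four nucleotides.
treated = [0.1, 0.9, 0.5, 0.05]
control = [0.1, 0.1, 0.1, 0.1]
for position, label in enumerate(classify(normalise(treated, control)), 1):
    print(position, label)
```

In practice the normalised reactivities are passed to structure-modelling software as restraints rather than thresholded directly; the labels here only illustrate the principle that low reactivity suggests a constrained, base-paired nucleotide.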
Here, initial experiments on the well-established TAR–Tat interaction suggested that SHAPE reagents reliably report upon the structural flexibility of the backbone at each nucleotide without being strongly affected by the ‘footprint’ of the protein binding. However, the structural ensemble of monomeric TAR RNA in the absence of Tat differs from the structural ensemble of the unshifted TAR that was incubated in the presence of Tat. Effectively, the technique appears to reveal the subpopulation of structures within the ensemble to which the protein did not bind.
We then used in-gel SHAPE to study the 5′ region of the HIV-1 gRNA from the transcription start to within the beginning of the Gag open reading frame that contains the major sequences required for gRNA encapsidation. We sought to identify changes that occur in the structural ensembles of the HIV-1 RNA monomer and dimer RNA species upon the addition of Gag or NC during the gRNA dimerisation process.
We found that the monomeric ensemble in the absence of a ligand, as well as the structures with Gag bound, largely resemble the LDI model, with the DIS paired in a long-range interaction with the U5 region. The monomeric ensemble with NC bound was more heterogeneous but still contained many of the LDI features. Within the dimeric ensemble, TAR, poly(A) and SL1 structures were frequently present, but the dimer in the absence of a ligand did not contain the U5:AUG helix; however, the shifted dimer did. Our results show the surprising diversity of the RNA structural ensembles that could potentially be formed, and how they differ when Gag or NC bind to the RNA. They also indicate the structures preferentially selected by Gag or NC, as well as how the proteins remodel the RNA.
XL-SHAPE was able to identify the initial interaction sites of Gag with the gRNA and show that these differ from those of NC at the same molar ratio of protein:RNA. Gag first interacts with the TAR region, and in doing so has structural effects on the downstream Ψ region structure. The interaction with NC alone is more promiscuous and is more reflective of how Gag interacts with the RNA when Gag is in higher concentrations. Our results suggest a mechanism by which HIV controls the switch between translating and packaging its genome, and provide insights into the RNA structures occurring during viral maturation. Interference with these critical structural transitions may have therapeutic potential.
Reference link: https://doi.org/10.3390/v13122389
More information: Olga A. Nikolaitchik et al, Selective packaging of HIV-1 RNA genome is guided by the stability of 5′ untranslated region polyA stem, Proceedings of the National Academy of Sciences (2021). DOI: 10.1073/pnas.2114494118