Within two months, SARS-CoV-2, a previously unknown coronavirus, has raced around the globe, infecting over 100,000 people, with numbers continuing to rise quickly.
Effective countermeasures require helpful tools to monitor viral spread and understand how the immune system responds to the virus.
Publishing in the March 16, 2020, online issue of Cell Host & Microbe, a team of researchers at La Jolla Institute for Immunology, in collaboration with researchers at the J. Craig Venter Institute, provides the first analysis of potential targets for effective immune responses against the novel coronavirus.
When the immune system encounters a bacterium or a virus, it zeroes in on tiny molecular features, so-called epitopes, which allow cells of the immune system to distinguish between closely related foreign invaders and focus their attack.
Having a complete map of viral epitopes and their immunogenicity is critical to researchers attempting to design new or improved vaccines to protect against COVID-19, the disease caused by SARS-CoV-2.
“Right now, we have limited information about which pieces of the virus elicit a solid human response,” says the study’s lead author Alessandro Sette, Dr. Biol.Sci, a professor in the Center for Infectious Disease and Vaccine Research at LJI.
“Knowing the immunogenicity of certain viral regions, or in other words, which parts of the virus the immune system reacts to and how strongly, is of immediate relevance for the design of promising vaccine candidates and their evaluation.”
While scientists currently know very little about how the human immune system responds to SARS-CoV-2, the immune response to other coronaviruses has been studied and a significant amount of epitope data is available.
Four other coronaviruses are currently circulating in the human population. They cause generally mild symptoms and together they are responsible for an estimated one quarter of all seasonal colds.
But every few years, a new coronavirus emerges that causes severe disease, as was the case with SARS-CoV in 2003 and MERS-CoV in 2012, and now SARS-CoV-2.
“SARS-CoV-2 is most closely related to SARS-CoV, which also happens to be the best characterized coronavirus in terms of epitopes,” explains first author Alba Grifoni, Ph.D., a postdoctoral researcher in the Sette lab.
For their study, the authors used available data from the LJI-based Immune Epitope Database (IEDB), which contains over 600,000 known epitopes from some 3,600 different species, and the Virus Pathogen Resource (ViPR), a complementary repository of information about pathogenic viruses.
The team compiled known epitopes from SARS-CoV and mapped the corresponding regions to SARS-CoV-2.
“We were able to map back 10 B cell epitopes to the new coronavirus and because of the overall high sequence similarity between SARS-CoV and SARS-CoV-2, there is a high likelihood that the same regions that are immunodominant in SARS-CoV are also dominant in SARS-CoV-2,” says Grifoni.
Five of these regions were found in the spike glycoprotein, which forms the “crown” on the surface of the virus that gave coronaviruses their name; two in the membrane protein, which is embedded in the membrane that envelops the protective protein shell around the viral genome; and three in the nucleoprotein, which forms the shell.
This transmission electron microscope image shows SARS-CoV-2—also known as 2019-nCoV, the virus that causes COVID-19—isolated from a patient in the U.S. Virus particles are shown emerging from the surface of cells cultured in the lab. The spikes on the outer edge of the virus particles give coronaviruses their name, crown-like. The image is credited to NIAID-RML.
In a similar analysis, T cell epitopes were also mostly associated with the spike glycoprotein and nucleoprotein.
In a completely different approach, Grifoni used the epitope prediction algorithm hosted by the IEDB to predict linear B cell epitopes.
A recent study by scientists at the University of Texas Austin determined the three-dimensional structure of the spike proteins, which allowed the LJI team to take the protein’s spatial architecture into account when predicting epitopes. This approach confirmed two of the likely epitope regions they had predicted earlier.
To substantiate the SARS-CoV-2 T cell epitopes identified based on their homology to SARS-CoV, Grifoni compared them with epitopes pinpointed by the Tepitool resource in the IEDB. Using this approach, she was able to verify 12 out of 17 SARS-CoV-2 T cell epitopes identified based on sequence similarities to SARS-CoV.
“The fact that we found that many B and T cell epitopes are highly conserved between SARS-CoV and SARS-CoV-2 provides a great starting point for vaccine development,” says Sette. “Vaccine strategies that specifically target these regions could generate immunity that’s not only cross-protective but also relatively resistant to ongoing virus evolution.”
Funding: The work was funded in part by the National Institute of Allergy and Infectious Diseases, a component of the National Institutes of Health through contracts 75N9301900065, 75N93019C00001 and 75N93019C00076.
Better Vaccines with CD4 and CD8-Stimulating Epitopes
When viruses infect human cells, epitopes from any of the virus’ proteins can theoretically be bound and presented by MHC class I or class II receptors on host cell surfaces, leading to stimulation of CD8 and CD4 T cells to provoke cellular and antibody-mediated immune responses (see Figure 1).
At Immunitrack, we believe that the key to developing powerful vaccines is to combine epitopes that stimulate an antibody response with epitopes that stimulate a cellular response. However, finding out which epitopes lead to highly effective immune responses and are thus worth pursuing for vaccine development is a challenge.
A number of epitope prediction algorithms are available, but these generally perform well only for a subset of Caucasian alleles; they are not always reliable for, e.g., MHC-C subtypes (HLA-C) and most MHC class II alleles.
1. Materials and Methods
- 1.1. Acquisition and Processing of Sequence Data
A total of 120 whole genome sequences of SARS-CoV-2 were downloaded on 21 February 2020 from the GISAID database (https://www.gisaid.org/CoV2020/) (Table S1). We excluded sequences that likely had spurious mutations resulting from sequencing errors, as indicated in the comment field of the GISAID data. These nucleotide sequences were aligned to the GenBank reference sequence (accession ID: NC_045512.2) and then translated into amino acid residues according to the coding sequence positions provided along the reference sequence for SARS-CoV-2 proteins (orf1a, orf1b, S, ORF3a, E, M, ORF6, ORF7a, ORF7b, ORF8, N, and ORF10). These sequences were aligned separately for each protein using the MAFFT multiple sequence alignment program. Reference protein sequences for SARS-CoV and MERS-CoV were obtained following the same procedure from GenBank using the accession IDs NC_004718.3 and NC_019843.3, respectively.
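As a rough illustration of the translation step described above, the following Python sketch translates one coding region of a genome using 1-based, inclusive coordinates. The function name `translate_cds`, the toy sequence, and the coordinates are assumptions made for illustration. The authors' own scripts were written in R and are not reproduced here, and the sketch ignores complications such as the ribosomal frameshift between orf1a and orf1b.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_cds(genome: str, start: int, end: int) -> str:
    """Translate genome[start..end] (1-based, inclusive) into amino acids.

    Codons containing gaps or ambiguous bases are reported as 'X';
    stop codons are reported as '*'.
    """
    cds = genome[start - 1:end].upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, usable, 3)
    )


# Toy example (hypothetical sequence, not SARS-CoV-2 data).
toy_genome = "GGATGTTTGTNTAAGG"
print(translate_cds(toy_genome, 3, 14))
```

In practice the coordinates would come from the coding sequence annotation of the reference genome, and each translated protein would then be passed to the alignment step.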
- 1.2. Acquisition and Filtering of Epitope Data
SARS-CoV-derived B cell and T cell epitopes were searched on the NIAID Virus Pathogen Database and Analysis Resource (ViPR) (https://www.viprbrc.org/; accessed 21 February 2020) by querying for the virus species name: “Severe acute respiratory syndrome-related coronavirus” from “human” hosts. We limited our search to include only the experimentally determined epitopes that were associated with at least one positive assay:
(i) Positive B cell assays (e.g., enzyme-linked immunosorbent assay (ELISA)-based qualitative binding) for B cell epitopes; and
(ii) either positive T cell assays (such as enzyme-linked immunospot (ELISPOT) or intracellular cytokine staining (ICS) IFN-γ release), or positive major histocompatibility complex (MHC) binding assays for T cell epitopes.
Strictly speaking, the latter set of epitopes, determined using positive MHC binding assays, are antigens which are candidate epitopes, since a T cell response has not been confirmed experimentally.
However, for brevity and to be consistent with the terminology used in the ViPR database, we will not make this qualification, and will simply refer to them as epitopes in this study. The number of B cell and T cell epitopes obtained from the database following the above procedure is listed in Table 1.
Table 1. Filtering criteria and corresponding number of Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV)-derived epitopes obtained from the Virus Pathogen Database and Analysis Resource (ViPR) database.
|Filtering Criteria||Epitope Type||Number of Epitopes|
|Positive T cell assays||T cell epitopes||115|
|Positive major histocompatibility complex (MHC) binding assays||T cell epitopes||959|
|Positive B cell assays||Linear B cell epitopes||298|
|Positive B cell assays||Discontinuous B cell epitopes||6|
- 1.3. Population-Coverage-Based T Cell Epitope Selection
Population coverages for sets of T cell epitopes were computed using the tool provided by the Immune Epitope Database (IEDB) (http://tools.iedb.org/population/; accessed 21 February 2020). This tool uses the distribution of MHC alleles (with at least 4-digit resolution, e.g., A*02:01) within a defined population (obtained from http://www.allelefrequencies.net/) to estimate the population coverage for a set of T cell epitopes. The estimated population coverage represents the percentage of individuals within the population that are likely to elicit an immune response to at least one T cell epitope from the set. To identify the set of epitopes associated with MHC alleles that would maximize the population coverage, we adopted a greedy approach:
(i) We first identified the MHC allele with the highest individual population coverage and initialized the set with their associated epitopes, then
(ii) we progressively added epitopes associated with other MHC alleles that resulted in the largest increase of the accumulated population coverage.
We stopped when no increase in the accumulated population coverage was observed by adding epitopes associated with any of the remaining MHC alleles.
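The greedy procedure above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' R code and not the IEDB tool's exact calculation: the `coverage` function assumes Hardy-Weinberg proportions within each locus and independence between loci, the function names are made up for illustration, and the allele frequencies are hypothetical. Selection is shown over MHC alleles; the epitopes associated with each selected allele would be added to the set alongside it.

```python
from math import prod


def coverage(alleles, freq):
    """Estimated fraction of people carrying at least one allele in `alleles`.

    Simplifying assumptions (not the IEDB tool's exact method):
    Hardy-Weinberg proportions per locus, independence between loci.
    """
    by_locus = {}
    for allele in alleles:
        locus = allele.split("*")[0]  # e.g. "HLA-A" from "HLA-A*02:01"
        by_locus[locus] = by_locus.get(locus, 0.0) + freq[allele]
    # Probability that neither chromosome carries a selected allele, per locus.
    miss = prod((1.0 - min(total, 1.0)) ** 2 for total in by_locus.values())
    return 1.0 - miss


def greedy_alleles(freq, tol=1e-12):
    """Add alleles one at a time, always taking the largest coverage gain."""
    chosen, best = [], 0.0
    remaining = set(freq)
    while remaining:
        candidate = max(
            sorted(remaining), key=lambda a: coverage(chosen + [a], freq)
        )
        new_cov = coverage(chosen + [candidate], freq)
        if new_cov - best <= tol:  # no remaining allele increases coverage
            break
        chosen.append(candidate)
        remaining.remove(candidate)
        best = new_cov
    return chosen, best


# Hypothetical allele frequencies, for illustration only.
toy_freq = {
    "HLA-A*02:01": 0.25,
    "HLA-A*24:02": 0.15,
    "HLA-B*40:01": 0.05,
    "HLA-B*07:02": 0.0,
}
selected, covered = greedy_alleles(toy_freq)
print(selected, round(covered, 4))
```

Because the first iteration starts from an empty set, it picks the allele with the highest individual coverage, which matches step (i); later iterations correspond to step (ii), and the loop ends under the stopping rule stated above.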
- 1.4. Constructing the Phylogenetic Tree
We used the publicly available software PASTA v1.6.4 to construct a maximum-likelihood phylogenetic tree of each structural protein using the unique set of sequences in the available data of SARS-CoV, MERS-CoV, and SARS-CoV-2. We additionally included the Zaria Bat coronavirus strain (accession ID: HQ166910.1) to serve as an outgroup. The appropriate parameters for tree estimation are automatically selected in the software based on the provided sequence data. For visualizing the constructed phylogenetic trees, we used the publicly available software Dendroscope v3.6.3. Each constructed tree was rooted with the outgroup Zaria Bat coronavirus strain, and a circular phylogram layout was used.
- 1.5. Data and Code Availability
All sequence and immunological data, and all scripts (written in R) for reproducing the results are available online.
- 2.1. Structural Proteins of SARS-CoV-2 Are Genetically Similar to SARS-CoV, but Not to MERS-CoV
SARS-CoV-2 has been observed to be close to SARS-CoV—much more so than MERS-CoV—based on full-length genome phylogenetic analysis [9,12]. We checked whether this is also true at the level of the individual structural proteins (S, E, M, and N). A straightforward reference-sequence-based comparison indeed confirmed this, showing that the M, N, and E proteins of SARS-CoV-2 and SARS-CoV have over 90% genetic similarity, while that of the S protein was notably reduced (but still high) (Figure 1a).
The similarity between SARS-CoV-2 and MERS-CoV, on the other hand, was substantially lower for all proteins (Figure 1a); a feature that was also evident from the corresponding phylogenetic trees (Figure 1b). We note that while the former analysis (Figure 1a) was based on the reference sequence of each coronavirus, it is indeed a good representative of the virus population, since few amino acid mutations have been observed in the corresponding sequence data (Figure S1). It is also noteworthy that while MERS-CoV is the more recent coronavirus to have infected humans, and is comparatively more recurrent (causing outbreaks in 2012, 2015, and 2018) (https://www.who.int/emergencies/mers-cov/en/), SARS-CoV-2 is closer to SARS-CoV, which has not been observed since 2004.
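The text does not state exactly how the similarity between reference sequences was computed. One common definition, shown here purely as an assumption, is the percentage of identical residues over the aligned columns of a pairwise alignment. The function name `percent_identity` and the toy sequences are illustrative.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of identical residues between two aligned sequences.

    Columns where both sequences have a gap ('-') are skipped; a gap
    opposite a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" and y == "-":
            continue  # column carries no information for this pair
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy aligned sequences (hypothetical, not real coronavirus proteins).
print(percent_identity("ACDE", "ACDF"))
```

Other definitions (for example, excluding all gapped columns or using a substitution matrix) would give somewhat different values, so the numbers in Figure 1a should not be assumed to follow this exact formula.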
Given the close genetic similarity between the structural proteins of SARS-CoV and SARS-CoV-2, we attempted to leverage immunological studies of the structural proteins of SARS-CoV to potentially aid vaccine development for SARS-CoV-2. We focused specifically on the S and N proteins as these are known to induce potent and long-lived immune responses in SARS-CoV [15–17,19,20,25,27]. We used the available SARS-CoV-derived experimentally determined epitope data (see Materials and Methods) and searched to identify T cell and B cell epitopes that were identical, and hence potentially cross-reactive, across SARS-CoV and SARS-CoV-2. We first report the analysis for T cell epitopes, which have been shown to provide a long-lasting immune response against SARS-CoV, followed by a discussion of B cell epitopes.
- 2.2. Mapping the SARS-CoV-Derived T Cell Epitopes That Are Identical in SARS-CoV-2, and Determining Those with Greatest Estimated Population Coverage
The SARS-CoV-derived T cell epitopes used in this study were experimentally determined from two different types of assays:
(i) Positive T cell assays, which tested for a T cell response against epitopes, and
(ii) positive MHC binding assays, which tested for epitope-MHC binding. We aligned these T cell epitopes across the SARS-CoV-2 protein sequences.
Among the 115 T cell epitopes that were determined by positive T cell assays (Table 1), we found that 27 epitope-sequences were identical within SARS-CoV-2 proteins and comprised no mutation in the available SARS-CoV-2 sequences (as of 21 February 2020) (Table 2). Interestingly, all of these were present in either the N (16) or S (11) protein. MHC binding assays were performed for 19 of these 27 epitopes, and these were reported to be associated with only five distinct MHC alleles (at 4-digit resolution): HLA-A*02:01, HLA-B*40:01, HLA-DRA*01:01, HLA-DRB1*07:01, and HLA-DRB1*04:01. Consequently, the accumulated population coverage of these epitopes (see Materials and Methods for details) is estimated to not be high for the global population (59.76%), and was quite low for China (32.36%). For the remaining 8 epitopes, since the associated MHC alleles are unknown, they could not be used in the population coverage computation. Additional MHC binding tests to identify the MHC alleles that bind to these 8 epitopes may reveal additional distinct alleles, beyond the five determined so far, that may help to improve population coverage.
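The identity criterion used here (an exact match with no mutation in any available sequence) can be sketched as a substring test over all sequences of a protein. The function name `identical_epitopes` and the toy sequences are hypothetical; only the two epitope strings GYQPYRVVVL and QPYRVVVLSF are taken from this study.

```python
def identical_epitopes(epitopes, protein_seqs):
    """Return the epitopes found as exact substrings in every sequence.

    `protein_seqs` holds ungapped amino acid sequences of one protein;
    an epitope that is missing or mutated in any sequence is dropped.
    """
    return [
        ep for ep in epitopes
        if protein_seqs and all(ep in seq for seq in protein_seqs)
    ]


# Toy sequences (hypothetical); the third carries a K->R change at site 2.
toy_seqs = [
    "MKTGYQPYRVVVLSFEL",
    "MKTGYQPYRVVVLSFEL",
    "MRTGYQPYRVVVLSFEL",
]
candidates = ["GYQPYRVVVL", "QPYRVVVLSF", "MKTGYQ"]
print(identical_epitopes(candidates, toy_seqs))
```

In this toy case the first two candidates are retained, while the third is dropped because one sequence differs within it.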
Table 2. SARS-CoV-derived T cell epitopes obtained using positive T cell assays that are identical in SARS-CoV-2 (27 epitopes in total).
|Protein||IEDB ID||Epitope||MHC Allele1||MHC Allele Class 1|
1 NA: Not available.
To further expand the search and identify potentially effective T cell targets covering a higher percentage of the population, we next considered the set of T cell epitopes that have been experimentally determined from positive MHC binding assays (Table 1). Unlike the previous epitope set, the ability of these epitopes to induce a T cell response against SARS-CoV was not experimentally determined.
Nonetheless, they also present promising candidates for inducing a response against SARS-CoV-2. For the expanded set of epitopes, all of which have at least one positive MHC binding assay, we found that 229 epitope-sequences have an identical match in SARS-CoV-2 proteins and have associated MHC allele information available (listed in Table S2).
Of these 229 epitopes, ~82% were MHC Class I restricted epitopes (Table S3). Importantly, 102 of the 229 epitopes were derived from either the S (66) or N (36) protein. Mapping all 66 S-derived epitopes onto the resolved crystal structure of the SARS-CoV S protein (Figure S2) revealed that 3 of these (GYQPYRVVVL, QPYRVVVLSF, and PYRVVVLSF) were located entirely in the SARS-CoV receptor-binding motif (https://www.uniprot.org/uniprot/P59594), known to be important for virus cell entry.
Similar to previous studies on HIV and HCV [35–38], we estimated population coverages for various combinations of MHC alleles associated with these 102 epitopes. Our aim was to determine sets of epitopes associated with MHC alleles with maximum population coverage, potentially aiding the development of vaccines against SARS-CoV-2. For selection, we adopted a greedy computational approach (see Materials and Methods), which identified a set of T cell epitopes estimated to maximize global population coverage.
This set comprised multiple T cell epitopes associated with 20 distinct MHC alleles and was estimated to provide an accumulated population coverage of 96.29% (Table 3).
Interestingly, the majority of the T cell epitopes for which a positive immune response has been determined using T cell assays (Table 2) were presented by the globally most-prevalent MHC allele (shown in blue color in Table 3).
Moreover, the functionally important epitopes located in the SARS-CoV receptor binding motif were associated with the second and third most-prevalent MHC alleles (underlined in Table 3).
Thus, while the ordering of T cell epitopes in Table 3 is based on the estimated global population coverage of the associated MHC alleles, it is also a natural order in which these epitopes should be tested experimentally for determining their potential to induce a positive immune response against SARS-CoV-2. We also computed the population coverage of this specific set of epitopes in China, the country most affected by the COVID-19 outbreak, which was estimated to be slightly lower (88.11%), as certain MHC alleles (e.g., HLA-A*02:01) associated with some of these epitopes are less frequent in the Chinese population (Table 3). Repeating the same greedy approach but focusing on the Chinese population, instead of a global population, the maximum population coverage was estimated to be 92.76% (Table S4).
Table 3. Set of the SARS-CoV-derived spike (S) and nucleocapsid (N) protein T cell epitopes (obtained from positive MHC binding assays) that are identical in SARS-CoV-2 and that maximize estimated population coverage globally (87 distinct epitopes).
Due to the promiscuous nature of binding between peptides and MHC alleles, multiple S and N peptides were reported to bind to individual MHC alleles. Thus, while we list all the S and N epitopes that bind to each MHC allele (Table 3), the estimated maximum population coverage may be achieved by selecting at least one epitope for each listed MHC allele. Likewise, many individual S and N epitopes were found to be presented by multiple alleles and thereby estimated to have varying global population coverage (listed in Table S5).
- 2.3. Mapping the SARS-CoV-Derived B Cell Epitopes That Are Identical in SARS-CoV-2
Similar to T cell epitopes, we used in our study the SARS-CoV-derived B cell epitopes that have been experimentally determined from positive B cell assays. These epitopes were classified as:
(i) Linear B cell epitopes (antigenic peptides), and (ii) discontinuous B cell epitopes (conformational epitopes with resolved structural determinants).
We aligned the 298 linear B cell epitopes (Table 1) across the SARS-CoV-2 proteins and found that 49 epitope-sequences, all derived from structural proteins, have an identical match and comprised no mutation in the available SARS-CoV-2 protein sequences (as of 21 February 2020). Interestingly, a large number (45) of these were derived from either the S (23) or N (22) protein (Table 4), while the remaining (4) were from the M protein (Table S6).
Table 4. SARS-CoV-derived linear B cell epitopes from S (23; 20 of which are located in subunit S2) and N (22) proteins that are identical in SARS-CoV-2 (45 epitopes in total).
On the other hand, all 6 SARS-CoV-derived discontinuous B cell epitopes obtained from the ViPR database (Table 5) were derived from the S protein. Based on the pairwise alignment between the SARS-CoV and SARS-CoV-2 reference sequences (Figure S3), we found that none of these mapped identically to the SARS-CoV-2 S protein, in contrast to the linear epitopes. For 3 of these discontinuous B cell epitopes (corresponding to antibodies S230, m396, and 80R [39–41]), there was a partial mapping, with at least one site having an identical residue at the corresponding site in the SARS-CoV-2 S protein (Table 5).
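The partial mapping of a discontinuous epitope can be sketched as a site-by-site comparison through a pairwise alignment: each epitope site, given as a position in the ungapped reference (SARS-CoV) sequence, is located in the alignment and compared with the residue in the query (SARS-CoV-2) sequence at the same column. The function name `identical_sites` and the toy alignment are assumptions made for illustration.

```python
def identical_sites(aligned_ref: str, aligned_query: str, ref_positions):
    """Return the reference positions whose aligned query residue is identical.

    `ref_positions` are 1-based positions in the ungapped reference sequence.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    # Map each ungapped reference position to its alignment column.
    column_of = {}
    pos = 0
    for col, residue in enumerate(aligned_ref):
        if residue != "-":
            pos += 1
            column_of[pos] = col
    return [
        p for p in ref_positions
        if aligned_ref[column_of[p]] == aligned_query[column_of[p]]
    ]


# Toy alignment (hypothetical): the query has one insertion and one change.
toy_ref = "MK-TLYNA"
toy_query = "MKQTLFNA"
print(identical_sites(toy_ref, toy_query, [3, 5, 6]))
```

An epitope maps identically only if every one of its sites is returned; a partial mapping, as reported for the three antibodies above, corresponds to a non-empty but incomplete list.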
Table 5. SARS-CoV-derived discontinuous B cell epitopes (and associated known antibodies [39–41]) that have at least one site with an identical amino acid to the corresponding site in SARS-CoV-2.
Mapping the residues of the linear and discontinuous B cell epitopes onto the available structure of the SARS-CoV S protein revealed their distinct association with the two functional subunits of the S protein: S1, important for interaction with the host cell receptor, and S2, involved in fusion of the cellular and virus membranes (Figure 2a). Specifically, 20 of the 23 linear epitopes (Table 4) mapped to S2 (Figure 2b). Thus, the antibodies targeting the identified linear epitopes in the S2 subunit might cross-react and neutralize both SARS-CoV and SARS-CoV-2, as suggested in a very recent study.
While S2 is comparatively less exposed than S1, it may be accessible to antibodies during the complex conformational changes involved in viral entry of coronaviruses [44–46]; though this remains to be more clearly understood. In contrast, the 3 discontinuous B cell epitopes (Table 5) mapped onto the more exposed S1 subunit (Figure 2c, left panel), which contains the receptor-binding motif of the SARS-CoV S protein. We observed that very few residues of the 3 discontinuous epitopes were identical within SARS-CoV and SARS-CoV-2 (Figure 2c, right panel). These differences suggest that the SARS-CoV-specific antibodies S230, m396, and 80R known to bind to these epitopes in SARS-CoV might not be able to bind to the same regions in SARS-CoV-2 S protein. Interestingly, while this paper was under review, this has been confirmed experimentally. Further studies are currently under way to identify other SARS-CoV antibodies that may bind to discontinuous epitopes of the SARS-CoV-2 S protein.
Supplementary Materials: The following are available online at http://www.mdpi.com/1999-4915/12/3/254/s1, Figure S1: Fraction of mutations in the observed sequences of the structural proteins of the three coronaviruses, Figure S2: Location of identified T cell epitopes on the SARS-CoV S protein structure (PDB ID: 5XLR), Figure S3: Pairwise sequence alignment of the reference sequences of the S proteins of SARS-CoV and SARS-CoV-2 (accession ID: NP_828851.1 and YP_009724390.1, respectively), Table S1: List of GISAID accession IDs for 120 genomic sequences of SARS-CoV-2, Table S2: List of all SARS-CoV-derived T cell epitopes determined using positive MHC binding assays (with associated MHC allele information available at 4 digit resolution) and found to be identical in SARS-CoV-2, Table S3: Distribution of all SARS-CoV-derived T cell epitopes obtained using positive MHC binding assays (with associated MHC allele information available at 4 digit resolution) and that are identical in SARS-CoV-2, Table S4: Set of SARS-CoV-derived S and N protein T cell epitopes (obtained using positive MHC binding assays) that are identical in SARS-CoV-2 and that maximize estimated population coverage in China (86 distinct epitopes), Table S5: Estimated global and Chinese population coverages for the individual SARS-CoV-derived S or N protein T cell epitopes (obtained using positive MHC binding assays), that are identical in SARS-CoV-2, Table S6: SARS-CoV-derived linear B cell epitopes, excluding those in S and N proteins, that are identical in SARS-CoV-2, Table S7: Acknowledgment table.
Author Contributions: Conceptualization: S.F.A., A.A.Q., and M.R.M.; methodology: S.F.A., A.A.Q., and M.R.M.; software: S.F.A. and A.A.Q.; validation: S.F.A. and A.A.Q.; formal analysis: S.F.A., A.A.Q., and M.R.M.; investigation: S.F.A., A.A.Q., and M.R.M.; resources: M.R.M.; data curation: S.F.A.; writing - original draft preparation: S.F.A., A.A.Q., and M.R.M.; writing - review and editing: S.F.A., A.A.Q., and M.R.M.; visualization: S.F.A. and A.A.Q.; supervision: A.A.Q. and M.R.M.; project administration: A.A.Q. and M.R.M.; funding acquisition: M.R.M. All authors have read and agreed to the published version of the manuscript.
Acknowledgments: We thank all the authors, the originating and submitting laboratories (listed in Table S7) for their sequence and metadata shared through GISAID, on which this research is based. M.R.M. and A.A.Q. were supported by the General Research Fund of the Hong Kong Research Grants Council (RGC) [Grant No. 16204519].
S.F.A. was supported by the Hong Kong Ph.D. Fellowship Scheme (HKPFS).
Conflicts of Interest: The authors declare no conflict of interest.
- 1. Wang, C.; Horby, P.W.; Hayden, F.G.; Gao, G.F. A novel coronavirus outbreak of global health concern. Lancet 2020, 395, 470–473. [CrossRef]
- Centers-of-Disease-Control-and-Prevention Confirmed 2019-nCoV cases globally. Available online: https://www.cdc.gov/coronavirus/2019-ncov/locations-confirmed-cases.html (accessed on 31 January 2020).
- 3. World-Health-Organization Statement on the second meeting of the International Health Regulations (2005) Emergency Committee regarding the outbreak of novel coronavirus (2019-nCoV). Available online: https://www.who.int/news-room/detail/30-01-2020-statement-on-the-second-meeting-of- the-international-health-regulations-(2005)-emergency-committee-regarding-the-outbreak-of-novel- coronavirus-(2019-ncov) (accessed on 31 January 2020).
- World-Health-Organization Coronavirus disease (COVID-19) outbreak. Available online: https://www.who. int/emergencies/diseases/novel-coronavirus-2019 (accessed on 31 January 2020).
- World-Health-Organization Statement on the meeting of the International Health Regulations (2005) Emergency Committee regarding the outbreak of novel coronavirus (2019-nCoV). Available online: https://www.who.int/news-room/detail/23-01-2020-statement-on-the-meeting-of-the-international-health- regulations-(2005)-emergency-committee-regarding-the-outbreak-of-novel-coronavirus-(2019-ncov) (accessed on 31 January 2020).
- Huang, C.; Wang, Y.; Li, X.; Ren, L.; Zhao, J.; Hu, Y.; Zhang, L.; Fan, G.; Xu, J.; Gu, X.; et al. Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China. Lancet 2020, 395, 497–506. [CrossRef]
- Heymann, D.L. Data sharing and outbreaks: Best practice exemplified. Lancet 2020, 395, 469–470. [CrossRef]
- Liu, X.; Wang, X.-J. Potential inhibitors for 2019-nCoV coronavirus M protease from clinically approved medicines. bioRxiv 2020, 2020.01.29.924100.
- Zhou, P.; Yang, X.-L.; Wang, X.-G.; Hu, B.; Zhang, L.; Zhang, W.; Si, H.-R.; Zhu, Y.; Li, B.; Huang, C.-L.; et al. A pneumonia outbreak associated with a new coronavirus of probable bat origin. Nature 2020. [CrossRef] [PubMed]
- World-Health-Organization Update 49 – SARS case fatality ratio, incubation period. Available online: https://www.who.int/csr/sars/archive/2003_05_07a/en/ (accessed on 31 January 2020).
- World-Health-Organization Middle East respiratory syndrome coronavirus (MERS-CoV). Available online: https://www.who.int/emergencies/mers-cov/en/ (accessed on 31 January 2020).
- Lu, R.; Zhao, X.; Li, J.; Niu, P.; Yang, B.; Wu, H.; Wang, W.; Song, H.; Huang, B.; Zhu, N.; et al. Genomic characterisation and epidemiology of 2019 novel coronavirus: Implications for virus origins and receptor binding. Lancet 2020, 6736, 1–10. [CrossRef]
- Letko, M.; Munster, V. Functional assessment of cell entry and receptor usage for lineage B β-coronaviruses, including 2019-nCoV. bioRxiv 2020, 2020.01.22.915660.
- Hoffmann, M.; Kleine-Weber, H.; Kruger, N.; Muller, M.; Drosten, C.; Pohlmann, S. The novel coronavirus 2019 (2019-nCoV) uses the SARS-coronavirus receptor ACE2 and the cellular protease TMPRSS2 for entry into target cells. bioRxiv 2020, 2020.01.31.929042.
- Yang, Z.-Y.; Kong, W.-P.; Huang, Y.; Roberts, A.; Murphy, B.R.; Subbarao, K.; Nabel, G.J. A DNA vaccine induces SARS coronavirus neutralization and protective immunity in mice. Nature 2004, 428, 561–564. [CrossRef]
- Deming, D.; Sheahan, T.; Heise, M.; Yount, B.; Davis, N.; Sims, A.; Suthar, M.; Harkema, J.; Whitmore, A.; Pickles, R.; et al. Vaccine efficacy in senescent mice challenged with recombinant SARS-CoV bearing epidemic and zoonotic spike variants. PLoS Med. 2006, 3, e525. [CrossRef]
- Graham, R.L.; Becker, M.M.; Eckerle, L.D.; Bolles, M.; Denison, M.R.; Baric, R.S. A live, impaired-fidelity coronavirus vaccine protects in an aged, immunocompromised mouse model of lethal disease. Nat. Med. 2012, 18, 1820–1826. [CrossRef] [PubMed]
- Lin, Y.; Shen, X.; Yang, R.F.; Li, Y.X.; Ji, Y.Y.; He, Y.Y.; De Shi, M.; Lu, W.; Shi, T.L.; Wang, J.; et al. Identification of an epitope of SARS-coronavirus nucleocapsid protein. Cell Res. 2003, 13, 141–145. [CrossRef] [PubMed]
- Wang, J.; Wen, J.; Li, J.; Yin, J.; Zhu, Q.; Wang, H.; Yang, Y.; Qin, E.; You, B.; Li, W.; et al. Assessment of immunoreactive synthetic peptides from the structural proteins of severe acute respiratory syndrome coronavirus. Clin. Chem. 2003, 49, 1989–1996. [CrossRef] [PubMed]
- Liu, X.; Shi, Y.; Li, P.; Li, L.; Yi, Y.; Ma, Q.; Cao, C. Profile of antibodies to the nucleocapsid protein of the severe acute respiratory syndrome (SARS)-associated coronavirus in probable SARS patients. Clin. Vaccine Immunol. 2004, 11, 227–228. [CrossRef] [PubMed]
- Tang, F.; Quan, Y.; Xin, Z.-T.; Wrammert, J.; Ma, M.-J.; Lv, H.; Wang, T.-B.; Yang, H.; Richardus, J.H.; Liu, W.; et al. Lack of peripheral memory B cell responses in recovered patients with severe acute respiratory syndrome: A six-year follow-up study. J. Immunol. 2011, 186, 7264–7268. [CrossRef]
- Peng, H.; Yang, L.-T.; Wang, L.-Y.; Li, J.; Huang, J.; Lu, Z.-Q.; Koup, R.A.; Bailer, R.T.; Wu, C.-Y. Long-lived memory T lymphocyte responses against SARS coronavirus nucleocapsid protein in SARS-recovered patients. Virology 2006, 351, 466–475. [CrossRef]
- Fan, Y.-Y.; Huang, Z.-T.; Li, L.; Wu, M.-H.; Yu, T.; Koup, R.A.; Bailer, R.T.; Wu, C.-Y. Characterization of SARS-CoV-specific memory T cells from recovered individuals 4 years after infection. Arch. Virol. 2009, 154, 1093–1099. [CrossRef]
- Ng, O.-W.; Chia, A.; Tan, A.T.; Jadi, R.S.; Leong, H.N.; Bertoletti, A.; Tan, Y.-J. Memory T cell responses targeting the SARS coronavirus persist up to 11 years post-infection. Vaccine 2016, 34, 2008–2014. [CrossRef]
- Liu, W.J.; Zhao, M.; Liu, K.; Xu, K.; Wong, G.; Tan, W.; Gao, G.F. T-cell immunity of SARS-CoV: Implications for vaccine development against MERS-CoV. Antiviral Res. 2017, 137, 82–92. [CrossRef]
- Li, C.K.-F.; Wu, H.; Yan, H.; Ma, S.; Wang, L.; Zhang, M.; Tang, X.; Temperton, N.J.; Weiss, R.A.; Brenchley, J.M.; et al. T cell responses to whole SARS coronavirus in humans. J. Immunol. 2008, 181, 5490–5500. [CrossRef]
- Channappanavar, R.; Fett, C.; Zhao, J.; Meyerholz, D.K.; Perlman, S. Virus-specific memory CD8 T cells provide substantial protection from lethal severe acute respiratory syndrome coronavirus infection. J. Virol. 2014, 88, 11034–11044. [CrossRef] [PubMed]
- Katoh, K.; Standley, D.M. MAFFT multiple sequence alignment software version 7: Improvements in performance and usability. Mol. Biol. Evol. 2013, 30, 772–780. [CrossRef] [PubMed]
- Pickett, B.E.; Sadat, E.L.; Zhang, Y.; Noronha, J.M.; Squires, R.B.; Hunt, V.; Liu, M.; Kumar, S.; Zaremba, S.; Gu, Z.; et al. ViPR: An open bioinformatics database and analysis resource for virology research. Nucleic Acids Res. 2012, 40, D593–D598. [CrossRef] [PubMed]
- Vita, R.; Mahajan, S.; Overton, J.A.; Dhanda, S.K.; Martini, S.; Cantrell, J.R.; Wheeler, D.K.; Sette, A.; Peters, B. The immune epitope database (IEDB): 2018 update. Nucleic Acids Res. 2019, 47, D339–D343. [CrossRef]
- Mirarab, S.; Nguyen, N.; Guo, S.; Wang, L.-S.; Kim, J.; Warnow, T. PASTA: Ultra-large multiple sequence alignment for nucleotide and amino-acid sequences. J. Comput. Biol. 2015, 22, 377–386. [CrossRef]
- Huson, D.H.; Scornavacca, C. Dendroscope 3: An interactive tool for rooted phylogenetic trees and networks. Syst. Biol. 2012, 61, 1061–1067. [CrossRef]
- Ahmed, S.F. Data and software code for reproducing results of this paper. Available online: https://github.com/faraz107/2019-nCoV-T-Cell-Vaccine-Candidates (accessed on 31 January 2020).
- Li, F. Structure of SARS coronavirus spike receptor-binding domain complexed with receptor. Science 2005, 309, 1864–1868. [CrossRef]
- Dahirel, V.; Shekhar, K.; Pereyra, F.; Miura, T.; Artyomov, M.; Talsania, S.; Allen, T.M.; Altfeld, M.; Carrington, M.; Irvine, D.J.; et al. Coordinate linkage of HIV evolution reveals regions of immunological vulnerability. Proc. Natl. Acad. Sci. 2011, 108, 11530–11535. [CrossRef]
- Quadeer, A.A.; Louie, R.H.Y.; Shekhar, K.; Chakraborty, A.K.; Hsing, I.-M.; McKay, M.R. Statistical linkage analysis of substitutions in patient-derived sequences of genotype 1a hepatitis C virus nonstructural protein 3 exposes targets for immunogen design. J. Virol. 2014, 88, 7628–7644. [CrossRef]
- Ahmed, S.F.; Quadeer, A.A.; Morales-Jimenez, D.; McKay, M.R. Sub-dominant principal components inform new vaccine targets for HIV Gag. Bioinformatics 2019, 35, 3884–3889. [CrossRef]
- Quadeer, A.A.; Morales-Jimenez, D.; McKay, M.R. Co-evolution networks of HIV/HCV are modular with direct association to structure and function. PLOS Comput. Biol. 2018, 14, e1006409. [CrossRef]
- Prabakaran, P.; Gan, J.; Feng, Y.; Zhu, Z.; Choudhry, V.; Xiao, X.; Ji, X.; Dimitrov, D.S. Structure of severe acute respiratory syndrome coronavirus receptor-binding domain complexed with neutralizing antibody. J. Biol. Chem. 2006, 281, 15829–15836. [CrossRef] [PubMed]
- Zhu, Z.; Chakraborti, S.; He, Y.; Roberts, A.; Sheahan, T.; Xiao, X.; Hensley, L.E.; Prabakaran, P.; Rockx, B.; Sidorov, I.A.; et al. Potent cross-reactive neutralization of SARS coronavirus isolates by human monoclonal antibodies. Proc. Natl. Acad. Sci. 2007, 104, 12123–12128. [CrossRef]
- Hwang, W.C.; Lin, Y.; Santelli, E.; Sui, J.; Jaroszewski, L.; Stec, B.; Farzan, M.; Marasco, W.A.; Liddington, R.C. Structural basis of neutralization by a human anti-severe acute respiratory syndrome spike protein antibody, 80R. J. Biol. Chem. 2006, 281, 34610–34616. [CrossRef] [PubMed]
- UniProt UniProtKB – P59594 (SPIKE_CVHSA). Available online: https://www.uniprot.org/uniprot/P59594 (accessed on 31 January 2020).
- Walls, A.C.; Park, Y.-J.; Tortorici, M.A.; Wall, A.; McGuire, A.T.; Veesler, D. Structure, function and antigenicity of the SARS-CoV-2 spike glycoprotein. bioRxiv 2020, 2020.02.19.956581.
- Walls, A.C.; Xiong, X.; Park, Y.-J.; Tortorici, M.A.; Snijder, J.; Quispe, J.; Cameroni, E.; Gopal, R.; Dai, M.; Lanzavecchia, A.; et al. Unexpected receptor functional mimicry elucidates activation of coronavirus fusion. Cell 2019, 176, 1026–1039.e15. [CrossRef] [PubMed]
- Walls, A.C.; Tortorici, M.A.; Snijder, J.; Xiong, X.; Bosch, B.-J.; Rey, F.A.; Veesler, D. Tectonic conformational changes of a coronavirus spike glycoprotein promote membrane fusion. Proc. Natl. Acad. Sci. 2017, 114, 11157–11162. [CrossRef] [PubMed]
- Song, W.; Gui, M.; Wang, X.; Xiang, Y. Cryo-EM structure of the SARS coronavirus spike glycoprotein in complex with its host cell receptor ACE2. PLOS Pathog. 2018, 14, e1007236. [CrossRef]
- Wrapp, D.; Wang, N.; Corbett, K.S.; Goldsmith, J.A.; Hsieh, C.-L.; Abiona, O.; Graham, B.S.; McLellan, J.S. Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation. Science 2020, 2011, eabb2507. [CrossRef]
- Tian, X.; Li, C.; Huang, A.; Xia, S.; Lu, S.; Shi, Z.; Lu, L.; Jiang, S.; Yang, Z.; Wu, Y.; et al. Potent binding of 2019 novel coronavirus spike protein by a SARS coronavirus-specific human monoclonal antibody. Emerg. Microbes Infect. 2020, 9, 382–385. [CrossRef]
- Ferguson, A.L.; Mann, J.K.; Omarjee, S.; Ndung’u, T.; Walker, B.D.; Chakraborty, A.K. Translating HIV sequences into quantitative fitness landscapes predicts viral vulnerabilities for rational immunogen design. Immunity 2013, 38, 606–617. [CrossRef] [PubMed]
- Chakraborty, A.K.; Barton, J.P. Rational design of vaccine targets and strategies for HIV: A crossroad of statistical physics, biology, and medicine. Reports Prog. Phys. 2017, 80, 032601. [CrossRef] [PubMed]
- Quadeer, A.A.; Louie, R.H.Y.; McKay, M.R. Identifying immunologically-vulnerable regions of the HCV E2 glycoprotein and broadly neutralizing antibodies that target them. Nat. Commun. 2019, 10, 2073. [CrossRef] [PubMed]
- Louie, R.H.Y.; Kaczorowski, K.J.; Barton, J.P.; Chakraborty, A.K.; McKay, M.R. Fitness landscape of the human immunodeficiency virus envelope protein that is targeted by antibodies. Proc. Natl. Acad. Sci. 2018, 115, E564–E573. [CrossRef]
- Quadeer, A.A.; Barton, J.P.; Chakraborty, A.K.; McKay, M.R. Deconvolving mutational patterns of poliovirus outbreaks reveals its intrinsic fitness landscape. Nat. Commun. 2020, 11, 377. [CrossRef]
- Mann, J.K.; Barton, J.P.; Ferguson, A.L.; Omarjee, S.; Walker, B.D.; Chakraborty, A.; Ndung’u, T. The fitness landscape of HIV-1 Gag: Advanced modeling approaches and validation of model predictions by in vitro testing. PLOS Comput. Biol. 2014, 10, e1003776. [CrossRef]
- Ramaiah, A.; Arumugaswami, V. Insights into cross-species evolution of novel human coronavirus 2019-nCoV and defining immune determinants for vaccine development. bioRxiv 2020, 2020.01.29.925867.