The study findings were published in the peer-reviewed journal Nature Communications. https://www.nature.com/articles/s41467-022-31999-6
Multiple host genetic variants have been identified 12,13,14,15 that predispose SARS-CoV-2 infected individuals to a severe course of COVID-19, including hospitalisation and risk of death, pointing to causal mechanisms. To translate these findings into clinical management or the identification of novel drug targets and repurposing opportunities, a deep understanding of the involved causal genes is needed.
We identified six candidate causal genes and their proteins by refining known risk loci (ELF5, SFTPD) and by prioritising suggestive loci (CSF3, RAB2A, HSP40, NUDT5) through the integration of plasma proteomics. Using scnRNAseq data across various sites of the respiratory system, we demonstrate that the strongest and most robust candidate, ELF5 (associated with a >4-fold higher risk of developing severe COVID-19), is specifically expressed in primary target cells of SARS-CoV-2 (for example, sustentacular 61, AT2 46, and secretory or ciliated epithelial cells 62), with evidence of co-expression with genes encoding key host factors, such as ACE2 and TMPRSS2.
We further find genetically anchored evidence that aligns with a recent clinical trial60 suggesting human recombinant granulocyte colony-stimulating factor (G-CSF) as a potential treatment option among patients with COVID-19 and severe lymphopenia to mitigate adverse outcomes.
ELF5 is a member of the erythroblast transformation-specific (Ets) transcription factor family and is best known for its possible role in breast or prostate cancer, tissues with high fractions of epithelial cells 63,64, and less for its possible role in lung development 55,65 and possibly cystic fibrosis 66.
Early work in lung tissue cultures and mouse models described a dynamic expression pattern of Elf5 during embryogenesis and lung branching, including almost complete downregulation in distal lung postnatally, while residual expression in proximal airways persisted55,65.
Overexpression of Elf5 during early but not late embryonal development (after E16.5) caused a severe cystic lung phenotype characterised by disrupted branching and a dilated airway epithelium 55, characteristics that are also seen in autopsies of COVID-19 patients 45. While such a drastic intervention in mouse models is not comparable to the subtle effect of a common genetic variant, the observation that key host factors for SARS-CoV-2 (Ace2 and Tmprss4) are upregulated in Elf5-overexpressing AT2 cells partly aligns with our observations using scRNAseq data.
The role of ELF5 in secretory and AT2 cells of the airway and alveolar epithelium, respectively, may have implications for the wound healing response. As cells with stem-like capacity, they are involved in the maintenance and repair of their respective cellular niches 49,69.
Thus, any surviving secretory and AT2 cells that drive the repopulation of the epithelium could potentially have aberrant repair programmes mediated by ELF5 and therefore possibly rs766826. An accumulation of AT2 cells in a regenerative transitional cell state has recently been suggested for COVID-19 17.
Up to 60% of COVID-19 patients report transient anosmia 70. The underlying aetiology, however, remains largely elusive. Direct infection of, and hence damage to, olfactory sensory neurons by SARS-CoV-2 could be one obvious explanation. Viral particles have been shown to be present in neuronal cells of the olfactory mucosa, possibly presenting a route for CNS infection 71. However, the generally undetectable expression levels of ACE2 in those cells make them an unlikely primary target compared to, for example, epithelial cells 61.
Previous studies suggested that the loss of essential supporting cells, sustentacular cells, in the olfactory mucosa causes anosmia 61,72. Sustentacular cells have been suggested as primary targets of SARS-CoV-2 based on high ACE2 expression 46,61,73, supported by in vivo models showing a high viral load and rapid desquamation of the olfactory epithelium following infection 74,75.
This finding is in line with our observations from samples of COVID-19 patients. Our observation that sustentacular cells, as well as other secretory epithelial cells in the olfactory mucosa, express high levels of ELF5, along with a possible link to ACE2 expression, might indicate a modulating role of ELF5 expression for this common symptom. However, a recent genome-wide association study (GWAS) for anosmia 76 among self-reported COVID-19 cases did not identify rs766826, and hence ELF5 expression. Larger GWAS for anosmia and functional studies are needed to clarify a possible role of ELF5 in the onset of anosmia during SARS-CoV-2 infection.
We provide genetically anchored evidence that people with higher plasma G-CSF abundances are less likely to develop severe COVID-19, suggesting a protective effect, possibly via early recruitment of neutrophils to the entry sites of SARS-CoV-2 43. Colony-stimulating factors, such as G-CSF, are haematopoietic growth factors and are actively investigated as treatment options for COVID-19 77.
A recent open-label, multicentre, randomised clinical trial 60 evaluated the efficacy of rhG-CSF to improve symptoms among 200 COVID-19 patients with lymphopenia (lymphocyte cell count <800 per µL) but without comorbidities. While no significant effect on the primary endpoint (time to improvement) was detected, patients treated with rhG-CSF experienced significantly fewer severe adverse effects, including respiratory failure, acute respiratory distress syndrome, sepsis, or septic shock 60.
The treatment effect further seemed to depend on baseline lymphocyte counts, with patients <400 per µL benefiting the most. However, leucocytosis was common in the treatment arm, including severe cases. We note that our results and the trial are in stark contrast to observational studies associating higher G-CSF plasma levels 78,79 and rhG-CSF treatment among cancer patients with a poor prognosis 80,81, possibly explained by the inability of such studies to distinguish cause and effect.
Candidate proteins highlighted in the present study might generally act via two distinct mechanisms. Firstly, they may increase or decrease the susceptibility to infection with SARS-CoV-2 in the first place, which is also the most powered outcome investigated by the COVID-19 HGI. The effect of BGAT encoded by ABO falls most likely into this category 15. Secondly, once patients are sufficiently infected, host proteins might contribute to exaggerated replication or spreading of the virus into different organ systems, or contribute to the hyperinflammatory response seen in many severe COVID-19 cases with subsequent injury, and possibly failure, of multiple organ systems, including the lung.
We observed at least three times higher effect estimates of candidate genetic variants for severe COVID-19 compared to testing positive for SARS-CoV-2 for all remaining candidate proteins (Supplemental Data 5), making them likely candidates to contribute to disease severity, which was supported by analysis from the COVID-19 HGI for ELF5, OAS1, and SFTPD 15.
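The comparison of effect estimates described above can be sketched as follows. This is a minimal illustration, assuming that effects are expressed as per-allele log odds ratios; the function names, the threshold argument, and the example odds ratios are hypothetical and are not values from the study.

```python
import math


def effect_ratio(beta_severe: float, beta_infection: float) -> float:
    """Ratio of absolute log-odds effects: severe COVID-19 vs. testing positive."""
    if beta_infection == 0:
        return math.inf
    return abs(beta_severe) / abs(beta_infection)


def likely_severity_variant(beta_severe: float, beta_infection: float,
                            min_ratio: float = 3.0) -> bool:
    """Flag a variant whose effect on severity is at least `min_ratio` times
    its effect on infection (threshold of 3 taken from the text above)."""
    return effect_ratio(beta_severe, beta_infection) >= min_ratio


# Hypothetical per-allele odds ratios, chosen only for illustration.
or_severe, or_infection = 1.20, 1.04
ratio = effect_ratio(math.log(or_severe), math.log(or_infection))
print(f"effect ratio: {ratio:.2f}")
print("likely severity variant:",
      likely_severity_variant(math.log(or_severe), math.log(or_infection)))
```

Comparing on the log-odds scale rather than the odds-ratio scale keeps the ratio symmetric for risk-increasing and protective alleles.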
Apart from the (refined) annotation of causal genes at known risk loci, establishing a shared signal across different molecular layers and COVID-19 subthreshold findings can reveal yet-to-be-identified risk genes and proteins. For example, we identified RAB2A, encoding Ras-related protein Rab2A, as a suggestive causal gene for severe COVID-19, which was only identified as a genome-wide significant locus while this paper was under review, in an analysis with substantially larger case numbers 82. While other findings, including CSF3 (encoding G-CSF), still warrant statistical identification at genome-wide significance for COVID-19 outcomes before being unambiguously declared as genetic risk loci, we argue that establishing convergence of different biological entities at a genetic signal can greatly increase confidence in the plausibility of findings.
For example, out of all findings at the CSF3 locus, only the most powered ones, that is, white blood cell counts, reach genome-wide significance, although the cluster identified using multi-trait colocalisation aligns with the known biology of G-CSF as a myelopoietic growth factor and was further supported by external trial evidence.
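The notion of a shared genetic signal can be illustrated with a pairwise colocalisation sketch based on approximate Bayes factors. Note that this is a simpler, two-trait technique than the multi-trait colocalisation used in this study; the prior standard deviation and the prior probabilities `p1`, `p2`, and `p12` are commonly used defaults assumed here for illustration, and all function names are hypothetical.

```python
import math


def log_abf(beta, se, prior_sd=0.15):
    """Wakefield's log approximate Bayes factor for a single variant."""
    z = beta / se
    r = prior_sd ** 2 / (prior_sd ** 2 + se ** 2)
    return 0.5 * (math.log(1.0 - r) + r * z ** 2)


def log_sum(xs):
    """Numerically stable log(sum(exp(x)))."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def log_diff(a, b):
    """log(exp(a) - exp(b)); returns -inf when the difference is not positive."""
    if b >= a:
        return -math.inf
    return a + math.log1p(-math.exp(b - a))


def coloc_posteriors(labf1, labf2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities for H0..H4 given per-variant log ABFs of two traits.

    H3: both traits associated, distinct causal variants.
    H4: both traits associated, one shared causal variant.
    """
    both = [a + b for a, b in zip(labf1, labf2)]
    s1, s2, s12 = log_sum(labf1), log_sum(labf2), log_sum(both)
    log_h = [
        0.0,
        math.log(p1) + s1,
        math.log(p2) + s2,
        math.log(p1) + math.log(p2) + log_diff(s1 + s2, s12),
        math.log(p12) + s12,
    ]
    denom = log_sum(log_h)
    return [math.exp(h - denom) for h in log_h]


# Toy summary statistics (beta, se) for three variants; illustrative only.
trait_a = [log_abf(0.01, 0.05), log_abf(0.50, 0.05), log_abf(0.02, 0.05)]
trait_b = [log_abf(0.02, 0.05), log_abf(0.50, 0.05), log_abf(0.01, 0.05)]
print("PP for a shared signal (H4):", coloc_posteriors(trait_a, trait_b)[4])
```

In this toy input both traits peak at the same variant, so most posterior mass falls on H4; moving one peak to a different variant shifts it to H3.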
Although the GWAS summary statistics from the COVID-19 HGI represent multiple ancestries and the signal at the ELF5 locus has recently been replicated in a Brazilian cohort 83, the pQTL instruments are based on a single ancestry, and genetic studies of plasma abundances of proteins in other ancestries may reveal additional candidate proteins that may help to explain the variable prevalence of adverse COVID-19 outcomes across ethnicities 8. We obtained some evidence, based on an open chromatin region, that rs766826 might act through a mechanism that is possibly unique to AT2 cells. The concrete underlying mechanism, however, remains elusive. Further studies are needed to decipher the role of rs766826 in the cell-type specific expression of ELF5.
The same holds true for the suggested mechanisms of action for ELF5, for example, co-expression with and possibly regulation of ACE2 or TMPRSS2, which need to be tested in appropriate cellular and animal models, also to investigate the role of ELF5 in tissues of the respiratory system more generally.
Although our results started with the investigation of proteins measured in plasma and might hence provide possible biomarkers for severe COVID-19 in a clinical setting, we did not identify concordant associations based on plasma proteomic profiling for most of the candidates in public data sets 31,84. This likely reflects the general segregation of proteins that possibly cause a more severe outcome of COVID-19 from those that are a consequence of SARS-CoV-2 infection and COVID-19.
We note that while MR can indicate the direction of effects, estimates should be interpreted with caution when plasma/blood is not the tissue of action of the protein or if cis-pQTL(s) can be linked to protein-altering variants or splicing event QTLs32. These effects, along with a possible general moderate biological effect, might have contributed to the small effect sizes for BGAT (linked to a splicing QTL) or SFTPD (the cis-pQTL, rs721917, being a missense variant, p.M31T). Finally, while we introduced filters on top of a high PP for a shared signal to ensure robust candidate proteins, correction for multiple testing in statistical colocalisation is still an area of debate and further developments are needed.
Our results identify potential modulators of a poor prognosis among COVID-19 patients, with potential therapeutic options. We highlight ELF5 as a potential regulator in cells that are the primary targets of SARS-CoV-2 by combining population-level genetic evidence with gene expression at single-cell resolution, providing a tangible hypothesis for further functional follow-up studies to investigate the role of ELF5 in viral entry and wound healing of the epithelial layer of the respiratory system upon severe COVID-19.