The study findings were published on a preprint server and are currently being peer reviewed. https://www.medrxiv.org/content/10.1101/2021.12.08.21267433v1
There is marked heterogeneity in the clinical manifestations of coronavirus disease 2019 (COVID-19), which is caused by infection with the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2).
Symptoms range from mild flu-like illness to severe respiratory failure requiring supplemental oxygen, intubation or intensive care unit (ICU) care, and multiple distinct cardiovascular complications have also been identified1.
Demographic factors and existing clinical comorbidities are associated with severe COVID-19. For example, age, male gender, Black and South Asian ancestry, diabetes, obesity and chronic lung disease are associated with increased risk of COVID-19-related mortality2.
In June 2020, the Severe COVID GWAS Group used genome-wide association study (GWAS) analysis of common genetic variants and was the first to identify two single nucleotide polymorphisms (SNPs) with genome-wide significance for severe COVID-19. The meta-analysis included 1,610 participants with severe COVID-19 and respiratory failure and 2,205 healthy controls across seven hospitals in Italy and Spain3.
The first SNP was rs11385942 at locus 3p21.31, with a signal spanning multiple genes, including chemokine receptors: SLC6A20, LZTFL1, CCR9, FYCO1, CXCR6 and XCR1. The second SNP was rs657152 at locus 9q34.2, coinciding with the ABO blood group locus, with analyses showing greater risk of severe COVID-19 in carriers of blood group A and protective effects in individuals with blood group O.
Further, a small study of 4 young male patients without chronic disease treated for severe COVID-19 requiring mechanical ventilation and ICU care analyzed rare variants and found loss-of-function variants in TLR7 on the X-chromosome with associated impairment in type I and type II interferon (IFN) responses4.
In December 2020, the Genetics Of Mortality In Critical Care (GenOMICC) genome-wide association study of 2,244 critically ill patients across 208 United Kingdom (UK) ICUs identified and replicated four additional genome-wide significant signals5. Most recently, in July 2021, the COVID-19 Host Genetics Initiative identified ten distinct genome-wide significant loci associated with severe COVID-19 and confirmed the findings of these earlier studies6.
Given that a greater burden of comorbidities has also been observed in patients with severe COVID-19, we aimed to identify associations between these same severe COVID-19 genetic loci and a wide range of comorbidities, with the goal of better understanding the potential genetic risk of severe COVID-19 mediated by these variants.
Phenome-wide association study (PheWAS) has emerged as an unbiased approach to identify novel associations of previously identified, disease-associated genetic variants across many phenotypes. One such PheWAS has been conducted for the 3p21.31 locus7; however, additional phenotypic associations and broader implications of risk for the other identified COVID-19 genetic loci have not yet been described.
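The PheWAS idea described above can be illustrated with a minimal sketch: take one variant, test it against many diagnoses, and correct for the number of tests. The sketch below makes several simplifying assumptions that are not taken from the study: it uses unadjusted 2x2 carrier-by-diagnosis tables with a Wald test on the log odds ratio (a real PheWAS would typically fit covariate-adjusted regression models), it applies a plain Bonferroni threshold, and the function names, phenotype labels and counts are hypothetical.

```python
import math
from typing import Dict, List, Tuple

# One 2x2 table per phenotype:
# (carriers with dx, carriers without dx, non-carriers with dx, non-carriers without dx)
Table = Tuple[int, int, int, int]


def odds_ratio_wald(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Return (odds ratio, two-sided Wald p-value) for a 2x2 table with no empty cell."""
    if min(a, b, c, d) <= 0:
        raise ValueError("every cell must be positive for this simple estimate")
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = math.log(odds_ratio) / se
    # Two-sided tail probability of a standard normal.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return odds_ratio, p_value


def phewas_scan(tables: Dict[str, Table], alpha: float = 0.05) -> List[Tuple[str, float, float, bool]]:
    """Test one variant against every phenotype, flagging Bonferroni-significant hits."""
    if not tables:
        return []
    threshold = alpha / len(tables)  # Bonferroni correction for the number of phenotypes
    rows = []
    for phenotype, counts in tables.items():
        odds_ratio, p_value = odds_ratio_wald(*counts)
        rows.append((phenotype, odds_ratio, p_value, p_value < threshold))
    # Most significant associations first.
    return sorted(rows, key=lambda row: row[2])


if __name__ == "__main__":
    # Hypothetical counts for illustration only; not data from the study.
    example = {
        "phenotype_a": (30, 70, 10, 90),
        "phenotype_b": (10, 10, 10, 10),
    }
    for phenotype, odds_ratio, p_value, significant in phewas_scan(example):
        print(f"{phenotype}: OR={odds_ratio:.2f}, p={p_value:.3g}, significant={significant}")
```

An odds ratio above 1 corresponds to "greater odds" of a diagnosis in carriers and a value below 1 to "lower odds", which is how the associations in this article are phrased.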
Using an unbiased PheWAS approach to clinical diagnoses in a large dataset of genetic and electronic health record data, we have identified novel phenotypic associations with the risk alleles from four of ten loci previously identified as associated with severe COVID-19 infection.
These associations could suggest that individuals carrying these genetic markers, known for their role in blood traits, host anti-viral response and inflammation, may have modified risk of cardiovascular disease, as well as auto-immune and inflammatory disorders including arthropathies and endocrinopathies, which in turn may increase the risk of severe COVID-19.
Alternatively, these genetic risk loci may have pleiotropic effects on these diseases and on COVID-19 related complications.
No prior phenotypic associations have been published for the rs72711165 SNP near TMEM65. TMEM65 is a mitochondrial inner-membrane protein that may play a role in mitochondrial respiration and cardiac development and function. Mutations in TMEM65 have been described to cause mitochondrial myopathy and neurologic disease14. Direct mechanisms related to the association with vascular dementia identified here are unclear, but warrant further investigation.
Prior to being identified as a risk variant for severe COVID-19, rs657152, within the ABO blood group locus, had been associated with a hypercoagulable state, arterial embolism and thrombosis, and other disorders of the circulatory system11. We validated these previously reported associations for the rs657152 SNP and identified novel associations, including greater odds of heart failure, diabetes mellitus and hypercholesterolemia, and lower odds of gastrointestinal disorders including duodenal ulcer and duodenitis.
Genetic predisposition for these cardiovascular and endocrine phenotypes may amplify the risk of adverse COVID-19 outcomes but may also have broader long-term health implications15. Taken together, these associations add support to risk factors contributing to a hypercoagulable state, as both the rs657152 risk allele and COVID-19 infection itself may increase the risk of thrombosis via multiple mechanisms16.
Mutations in KANSL1 are known to cause neurodevelopmental delay disorders described within 17q21.31 deletion syndrome or Koolen-de Vries syndrome17. KANSL1 plays a role in histone acetylation, microtubule stabilization and mitochondrial respiration18. Here, we identified novel associations with the rs1819040 SNP near KANSL1, including greater odds of atrial fibrillation, hypothyroidism and glaucoma, and interestingly, lower odds of postinflammatory pulmonary fibrosis. Biologic mechanisms linking these associations, as well as the risk of severe COVID-19, warrant further study.
We also found that rs74956615 near the TYK2 gene was associated with lower odds of psoriasis and related disorders, rheumatoid arthritis and thyrotoxicosis, as well as greater odds of tobacco use disorder.
Adding strength to the results for rs74956615, these findings were nominally validated in the CATHGEN cohort. TYK2, a member of the Janus Kinase (JAK) family, is involved in interleukin-23 (IL-23) signaling, a cascade associated with psoriasis via Th17 responses and IFN-α signaling.
Therapeutic targeting of JAK signaling and TYK2 is implicated in auto-immune and inflammatory diseases including both psoriasis and rheumatoid arthritis19. Nine prior SNPs in TYK2 have been reported in association with autoimmune diseases including psoriasis, rheumatoid arthritis, systemic lupus erythematosus (SLE) and inflammatory bowel disease (IBD); however, these studies have had mixed directions of effect.
A recent systematic review and meta-analysis identified protective effects against autoimmune disease for five TYK2 SNPs and risk for SLE associated with one20. Rare coding variants found to have protective effects have been associated with reductions in IL-23 and IFN-α signaling21.
Here, for the first time, we show decreased odds of psoriasis associated with rs74956615, which may implicate an impact of this allele on TYK2 gene function distinct from what has been identified in prior GWAS analyses of psoriasis. Notably, previous investigators studying the protective impact of TYK2 variants on autoimmune disease did not identify pleiotropic effects via PheWAS analyses22, and the associations of TYK2 with thyroid disease found in the present analyses have not been previously reported; however, the use of the UK Biobank cohort represents the largest analysis of TYK2 variants to date. Our study design does not allow for more detailed confirmation of whether the reported cases of hypothyroidism may have been autoimmune in etiology.
Though findings did not reach prespecified significance thresholds in the present analyses, the other identified COVID-19-related genetic variants suggest the importance of host antiviral defense mechanisms and inflammatory signaling. Zhou et al performed a PheWAS of 310,999 European individuals in the UK Biobank and identified blood cell traits including monocyte and eosinophil count to be associated with the 3p21.31 locus7.
Though findings at other loci were only nominally associated, they may still be suggestive of relevant phenotypic and molecular pathways for these genetic loci and warrant further investigation in additional clinical and pre-clinical models. Nominal associations with lower odds of eosinophilia corroborate the recent findings by Zhou et al for the 3p21.31 SNP rs11385942 [7].
In genetics, a locus (plural loci) is a specific, fixed position on a chromosome where a particular gene or genetic marker is located. Each chromosome carries many genes, with each gene occupying a different position or locus; in humans, the total number of protein-coding genes in a complete haploid set of 23 chromosomes is estimated at 19,000–20,000.
Genes may possess multiple variants known as alleles, and an allele may also be said to reside at a particular locus. Diploid and polyploid cells whose chromosomes have the same allele at a given locus are called homozygous with respect to that locus, while those that have different alleles at a given locus are called heterozygous.
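The homozygous/heterozygous distinction above can be expressed as a small check on the alleles a cell carries at one locus. This is only an illustrative sketch: the function name `zygosity` and the single-letter allele labels are assumptions made for the example, not standard nomenclature.

```python
from typing import Sequence


def zygosity(alleles: Sequence[str]) -> str:
    """Classify the genotype at a single locus from its allele copies."""
    if len(alleles) < 2:
        raise ValueError("need at least two allele copies (diploid or polyploid)")
    # Same allele on every chromosome copy means homozygous; any difference means heterozygous.
    return "homozygous" if len(set(alleles)) == 1 else "heterozygous"


if __name__ == "__main__":
    print(zygosity(["A", "A"]))            # diploid, identical alleles
    print(zygosity(["A", "G"]))            # diploid, different alleles
    print(zygosity(["A", "A", "A", "G"]))  # tetraploid with one differing copy
```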
The ordered list of loci known for a particular genome is called a gene map. Gene mapping is the process of determining the specific locus or loci responsible for producing a particular phenotype or biological trait. Association mapping, also known as “linkage disequilibrium mapping”, is a method of mapping quantitative trait loci (QTLs) that takes advantage of historic linkage disequilibrium to link phenotypes (observable characteristics) to genotypes (the genetic constitution of organisms), uncovering genetic associations.