The study findings were published in the peer-reviewed journal Nature Medicine.
Individuals with confirmed SARS-CoV-2 infection were at increased risk of reporting a wide range of symptoms at ≥12 weeks after infection, compared to propensity score-matched patients with no record of suspected or confirmed SARS-CoV-2 infection, after accounting for both sociodemographic and clinical characteristics and the reporting of symptoms before infection.
The symptoms most strongly associated with SARS-CoV-2 infection included some that are already recognized in previous studies12, such as anosmia, shortness of breath, chest pain and fever, but also included a range of other symptoms that have not previously been widely reported, such as hair loss and sexual dysfunction.
Previous SARS-CoV-2 infection was independently associated with the reporting to primary care of 20 of the 33 symptoms included in the WHO case definition and an additional 42 symptoms, beyond 12 weeks from infection. SARS-CoV-2 infection was associated with a 26% relative increase in risk of reporting at least one of the symptoms included in the WHO case definition for long COVID.
Among those with a history of confirmed SARS-CoV-2 infection, several risk factors were associated with reporting symptoms 12 weeks or more after infection. Female sex, a gradient of decreasing age, belonging to a Black, mixed ethnicity or other ethnic minority group, socioeconomic deprivation, smoking, high BMI and the presence of a wide range of comorbidities were associated with increased risk of both symptoms included in the WHO definition of long COVID and symptoms statistically associated with SARS-CoV-2 infection reported 12 weeks or more after infection.
Among those with a confirmed SARS-CoV-2 infection and who reported at least one symptom that was statistically associated with SARS-CoV-2 infection at least 12 weeks after infection, three major clusters of phenotypes of long COVID were observed. These included patients with symptoms dominated by (1) a broad spectrum of symptoms, including pain, fatigue and rash (80.0%); (2) respiratory symptoms, including cough, shortness of breath and phlegm (5.8%); and (3) mental health and cognitive symptoms, including anxiety, depression, insomnia and brain fog (14.2%).
A key strength of the study is the large sample size, which included 486,149 adults with a confirmed diagnosis of SARS-CoV-2 infection and 1.9 million propensity score-matched patients with no recorded evidence of SARS-CoV-2 infection. The large sample size provided adequate statistical power to assess differences in the reporting of a wide range of symptoms between the two cohorts and estimation of the association between reporting of symptoms and important sociodemographic and clinical risk factors with a high level of precision.
Another key strength of the study is the inclusion of a comparator group that did not have either suspected or confirmed SARS-CoV-2 infection and had been propensity score-matched for sociodemographic factors, previously reported symptoms and over 80 comorbidities.
This enabled us to assess the independent association between exposure to SARS-CoV-2 and the reporting of symptoms ≥12 weeks after infection, after accounting for many important confounders. A further strength is the large number of symptoms included in the analysis, which was based on a previous systematic review of the literature11, a scoping review of long COVID questionnaires and an extensive consultation with patients and clinicians20.
Symptom code lists were developed rigorously with systematic searches for relevant SNOMED CT codes with extensive clinical input. We also assessed the outcome of long COVID using the WHO case definition as well as a new definition that incorporated symptoms that were statistically associated with a history of SARS-CoV-2 infection.
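The matching step behind the comparator cohort can be illustrated with a minimal sketch. It assumes that a propensity score (the estimated probability of being in the infected cohort given the covariates) has already been computed for every patient, for example by logistic regression. The function name, the 1:4 default ratio (suggested only by the relative cohort sizes reported above) and the caliper of 0.05 are illustrative assumptions, not details taken from the study.

```python
from typing import Dict, List, Optional


def match_nearest(
    infected: Dict[str, float],
    comparators: Dict[str, float],
    ratio: int = 4,                   # assumed matching ratio, not from the study
    caliper: Optional[float] = 0.05,  # assumed maximum score distance
) -> Dict[str, List[str]]:
    """Greedy nearest-neighbour matching on propensity score, without replacement.

    `infected` and `comparators` map hypothetical patient IDs to
    pre-computed propensity scores in [0, 1].
    """
    available = dict(comparators)  # copy so the caller's pool is not mutated
    matches: Dict[str, List[str]] = {}
    for patient_id, score in infected.items():
        # Rank the remaining comparators by distance to this patient's score.
        ranked = sorted(available, key=lambda cid: abs(available[cid] - score))
        # Keep up to `ratio` comparators that fall inside the caliper.
        chosen = [
            cid
            for cid in ranked
            if caliper is None or abs(available[cid] - score) <= caliper
        ][:ratio]
        for cid in chosen:
            del available[cid]  # each comparator is used at most once
        matches[patient_id] = chosen
    return matches
```

Calling `match_nearest({"i1": 0.30}, {"c1": 0.29, "c2": 0.90}, ratio=1)` pairs `i1` with `c1` and leaves `c2` unmatched. Greedy matching depends on the order in which infected patients are processed; optimal matching algorithms avoid this at higher computational cost.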
A key limitation of the study is the use of routinely coded healthcare data. Coded symptom data in primary care records is likely to underrepresent the true symptom burden experienced by individuals with long COVID. This could be due to reduced access to primary care (especially during the first surge of the pandemic), patients not consulting their general practitioner (GP) about symptoms or the reason for the GP consultation being unrelated to COVID-19, thereby leading patients to underreport the full extent and breadth of their symptoms.
In addition, much of a patient’s clinical history, in terms of the symptoms reported, is recorded as free text, rather than as SNOMED CT codes21. The symptom data we used for the study thus cannot be used to make inferences about the absolute prevalence of these symptoms; however, as this underrepresentation would be expected to affect both the infected and propensity score-matched comparator cohorts equally, the data used in the present analysis can still be used to examine relative differences in the reporting of symptoms between patients infected with SARS-CoV-2 and patients with no recorded evidence of SARS-CoV-2 infection.
Conversely, with the evolving awareness of long COVID, it is possible that patients with a history of COVID-19 may have been more likely than those without to access primary care and alert clinicians to their symptoms, which could potentially lead to an inflation of the observed effect sizes.
This is potentially supported by the increased aHRs observed for symptoms such as cough, sneezing, fever and allergies among patients who were infected during the second surge of the pandemic, compared to those infected during the first surge, although this could also potentially be attributed to other reasons, such as changes in the dominant variants.
Another limitation of the study is potential misclassification bias. Community testing for SARS-CoV-2 was very limited during the first surge of the pandemic, and many individuals with COVID-19 who were not hospitalized were not tested. Furthermore, positive antigen test results may not be routinely coded within primary care.
There is some evidence that as much as 20–30% of SARS-CoV-2 test positive cases may be missing from primary care records22,23. It is therefore possible that some members of our propensity score-matched comparator cohort had been infected with SARS-CoV-2 but had simply not been tested or coded as confirmed COVID-19 within primary care.
We attempted to account for this bias by excluding individuals from the comparator cohort if they had a coded diagnosis of suspected COVID-19; however, this is unlikely to be completely sensitive in identifying individuals with unverified SARS-CoV-2 infection from the comparator cohort, which would potentially have the effect of attenuating the observed effect sizes.
Similarly, it is possible that some members of our cohort were hospitalized, as we were limited to using SNOMED CT codes for hospitalization within primary care records rather than linked Hospital Episode Statistics data, to which timely access was unavailable for our study.
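The attenuation caused by unrecorded infections in the comparator cohort can be illustrated with a short worked example. It is a simplified sketch that uses a risk ratio rather than the adjusted hazard ratios reported in the study, and all input values are hypothetical: if a fraction of the comparator cohort was in fact infected, the comparator's symptom risk becomes a mixture of the infected and uninfected risks, and the observed ratio moves toward 1.

```python
def observed_risk_ratio(
    risk_infected: float,
    risk_uninfected: float,
    contamination: float,
) -> float:
    """Risk ratio observed when part of the comparator cohort was truly infected.

    `contamination` is the (hypothetical) fraction of comparators with an
    unrecorded infection; all arguments are illustrative, not study data.
    """
    if not 0.0 <= contamination <= 1.0:
        raise ValueError("contamination must be between 0 and 1")
    # The comparator cohort's risk is a weighted mixture of the two true risks.
    comparator_risk = (
        (1.0 - contamination) * risk_uninfected + contamination * risk_infected
    )
    return risk_infected / comparator_risk
```

With hypothetical risks of 0.20 among the infected and 0.10 among the truly uninfected, the ratio is 2.0 with no contamination and falls below 2.0 as soon as any comparators are misclassified, which is the direction of bias described above.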
Finally, we were unable to incorporate all aspects of the WHO clinical case definition for long COVID, such as ‘impact on everyday functioning’ due to the lack of data on these domains within coded primary care data. Our findings support the results from our previous systematic review and meta-analysis on long COVID symptoms 11.
That review found the most prevalent symptoms to be fatigue, shortness of breath, muscle pain, joint pain, headache, cough, chest pain, altered sense of smell, altered taste and diarrhea.
Our current analysis was not able to assess symptom prevalence but rather the relative difference in symptoms between a large sample of individuals with and without recorded evidence of SARS-CoV-2 infection at ≥12 weeks after infection. We similarly identified anosmia, shortness of breath, fatigue and chest pain to be symptoms significantly associated with SARS-CoV-2 infection.
By contrast, we also identified new symptoms such as hair loss, sneezing, symptoms of sexual dysfunction (difficulties ejaculating and reduced libido), hoarse voice and fever as significantly associated.
Also, like our review 11, we found that female sex and the presence of a range of comorbidities were associated with an increased risk of developing persistent symptoms; however, it is likely that pre-existing comorbidities may have influenced the likelihood of GP consultations and symptom reporting.
In contrast to our review, the present analysis found that risk of reporting symptoms at ≥12 weeks after infection increased along a gradient of decreasing age in our cohort. This could partly be due to the adjustment for an extensive range of comorbidities or the differences in the populations studied. Most studies included in our review were based on hospitalized cohorts, whereas our present study excluded hospitalized patients.
Older patients with COVID-19 were more likely to be hospitalized than younger patients and, therefore, to be excluded from our study. Older non-hospitalized patients might, therefore, have had mild disease with low symptom burden.
We also found that patients from Black, mixed ethnicity and other minority ethnic backgrounds were at increased risk of persistent symptoms. This contradicts the findings from the analysis of the COVID-19 Infection Survey data, which found a lower prevalence of long COVID among all ethnic minority subgroups compared to those of white ethnicity24; however, the COVID-19 Infection Survey analysis included children, was restricted to those living in private residences and considered self-reported diagnosis of long COVID, defined as unexplained persistence of symptoms, 4 weeks after SARS-CoV-2 infection.
An international online cohort study of people with confirmed and suspected long COVID found that respondents reported an average of 56 symptoms across an average of nine organ systems8. A Norwegian prospective study of 312 home-isolated patients found persistent symptoms 6 months after infection 25.
Both studies were comprehensive analyses of symptom burden but lacked a control group and were therefore unable to make strong inferences about the relative contribution of SARS-CoV-2 infection to these symptoms over and above pre-existing health conditions or psychosocial effects related to the pandemic; however, like these studies, we also found that individuals with a history of confirmed SARS-CoV-2 reported a broad range of symptoms, with a total of 62 symptoms being associated at 12 or more weeks after infection. We were also able to control for potential confounders, including whether the symptoms of interest were reported before infection.
The COVID Symptom Study provided data on self-reported symptoms among participants enrolled on an app16. Among those with symptoms persisting 28 d or longer after infection, key symptoms included fatigue, headache, dyspnea and anosmia, which were all also significantly associated at ≥12 weeks in our cohort.
The COVID Symptom Study also found that long COVID was associated with increasing BMI and female sex, which is in keeping with our findings; however, the study also found that the risk of reporting long COVID symptoms increased with age, whereas our study observed the opposite trend after adjustment for a comprehensive range of potential confounders. Although the COVID Symptom Study is community-based, it includes individuals with a history of hospitalized and non-hospitalized COVID-19, so the discrepant age trend may be due to the exclusion from our study of older patients, who are more likely to be hospitalized.
One of the largest population-based surveys on COVID-19 and long COVID is the UK Office for National Statistics COVID Infection Survey26. This survey estimated that, as of 7 April 2022, 1.7 million people living in private households in the UK (2.7% of the population) were experiencing symptoms persisting beyond 4 weeks from SARS-CoV-2 infection, with 70% of them experiencing symptoms beyond 12 weeks.
Fatigue, shortness of breath, anosmia and difficulty concentrating were the main symptoms reported. The prevalence was greatest in females, those from more socioeconomically deprived areas, people working in health and social care and individuals living with health conditions and disabilities.
Our analysis showed similar symptoms, including cognitive effects, as well as similar risk factors; however, we were unable to assess the association between occupational status and reporting of symptoms due to a lack of occupational data in UK primary care records.
Whittaker and colleagues undertook an analysis of 456,002 patients with COVID-19 in England using the Clinical Practice Research Datalink (CPRD) Aurum database to determine the rates of GP consultations for post-COVID-19 sequelae 27.
This analysis included both hospitalized and non-hospitalized patients and two control groups consisting of patients without COVID-19 and those with influenza before the pandemic. Patients with COVID-19 managed in the community were significantly more likely to consult for loss of taste or smell and other symptoms such as joint pain, anxiety, depression, abdominal pain and diarrhea at ≥4 weeks after infection compared to 12 months before infection.
They also found that GP consultation rates for symptoms, prescriptions and healthcare use were mostly reduced in those who were managed in the community after the first COVID-19 vaccination dose; however, this study investigated only 23 symptoms based on the NICE 2020 guidelines 4 on managing the long-term effects of COVID-19, whereas in our study, we investigated 115 symptoms derived from a systematic assessment of previous studies and discussions with patients with lived experience of long COVID and clinicians11.
We were unable to estimate the effect of vaccination and infection year on long COVID symptoms in our study due to the very short follow-up period among those vaccinated and infected in the year 2022 (median 8 (IQR 4–14) and 12 (7–16) days, respectively) compared to those unvaccinated and infected in the year 2021 (33 (16–77) and 64 (31–90) days, respectively).
Furthermore, the majority (81%) of patients vaccinated before infection in our cohort were infected with SARS-CoV-2 within 2 weeks of vaccination, which would be before acquiring immunity from vaccination, thus restricting the validity of our data to assess the effects of vaccination on long COVID.
Further research is needed to estimate the prevalence of persistent symptoms associated with SARS-CoV-2 infection among patients presenting to primary care. Much of the symptom data in primary care records is held in free-text entries rather than as clinically coded data. Natural language processing could be used to leverage these textual data to gain more accurate estimates of the prevalence of these symptoms.
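As a rough illustration of how free-text entries might be mined for symptoms, the sketch below uses simple pattern matching with a basic negation check. It is far simpler than a clinical natural language processing pipeline and is not the approach of any particular study; the symptom patterns, negation cues and function name are all hypothetical choices made for the example.

```python
import re
from typing import Dict, List, Set

# Hypothetical patterns for three symptoms; a real lexicon would be far larger.
SYMPTOM_PATTERNS: Dict[str, List[str]] = {
    "anosmia": [r"\banosmia\b", r"\bloss of (?:sense of )?smell\b"],
    "shortness of breath": [r"\bshortness of breath\b", r"\bbreathless(?:ness)?\b"],
    "fatigue": [r"\bfatigue\b", r"\btired(?:ness)?\b"],
}

# Words that, when they precede a symptom in the same clause, negate it.
NEGATION_CUE = re.compile(r"\b(?:no|not|denies|without)\b", re.IGNORECASE)


def extract_symptoms(note: str) -> Set[str]:
    """Return the symptoms mentioned (and not negated) in a free-text note."""
    found: Set[str] = set()
    # Treat each clause separately so a negation only affects its own clause.
    for clause in re.split(r"[.;,\n]", note):
        for symptom, patterns in SYMPTOM_PATTERNS.items():
            for pattern in patterns:
                match = re.search(pattern, clause, re.IGNORECASE)
                # Skip the mention if a negation cue appears before it.
                if match and not NEGATION_CUE.search(clause[: match.start()]):
                    found.add(symptom)
    return found
```

For a note such as "Ongoing fatigue since March; denies shortness of breath. Reports loss of smell." the function returns fatigue and anosmia but not shortness of breath. Rule-based matching of this kind misses misspellings, abbreviations and more complex negation, which is one reason validated clinical language models would be needed for prevalence estimates.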
The 50 consolidated symptoms that were found to be associated with SARS-CoV-2 infection at 12 weeks or more after infection in our study were clustered into three phenotypes with varying risk factors. Further research is needed to confirm the identified clusters using prospective and routinely recorded patient-reported symptom data. Such analysis would allow assessment of whether clinical outcomes and the underlying pathophysiology differ between these subgroups and could potentially inform the development of targeted therapies for the different phenotypic subgroups.
There is also a need to obtain patient-reported data on symptoms and assess the association between symptom burden, quality of life and work capability to ascertain which symptoms have the greatest impact on individuals. Finally, there is a need to understand the natural history of long COVID by assessing symptom burden serially over time in a population-representative cohort with a history of COVID-19 alongside a matched control population.
Infection with SARS-CoV-2 is independently associated with the reporting of 62 symptoms spanning multiple organ systems 12 weeks or longer after infection. A wide range of both sociodemographic and clinical factors are independently associated with the development of persistent symptoms.
Additional research is needed to describe the natural history of long COVID and characterize symptom clusters, their pathophysiology and clinical outcomes. Further research is also needed to understand the health and social impacts of these persistent symptoms, to support patients living with long-term sequelae and to develop targeted treatments.
Pathophysiological mechanisms in long COVID or post-COVID syndrome. Based on current knowledge, the mechanisms involved in long COVID are complex and interrelated. The three major categories of pathophysiological change are: (1) direct cellular or tissue injury, caused by cytotoxicity or by hijacking of host metabolic machinery such as mitochondrial functioning or methyl group transfer; (2) immune activation and inflammation, which can either target host cells through antigen cross-reactivity or induce cell damage through inflammatory changes, including cytokines/chemokines and cellular infiltration; and (3) counter-physiological responses, corresponding to altered hormonal changes or responsive intracellular signaling pathways. A combination of the above mechanisms (upper panel, purple boxes), together with organ-specific pathophysiological changes that depend on viral tissue tropism and the microenvironment, is responsible for the respective clinical symptoms (lower panel, blue shaded boxes). Abbreviations: C-reactive protein (CRP); Interferon gamma (IFN-γ); Tumor necrosis factor-α (TNF-α); Interleukin-1β (IL-1β); Interleukin-1 (IL-1); Interleukin-6 (IL-6); Interleukin-8 (IL-8); Matrix metalloproteinase-7 (MMP-7); Hepatocyte growth factor (HGF); G-protein-coupled receptors (GPCR); von Willebrand factor (vWF); Thyroid stimulating hormone (TSH); Triiodothyronine (T3); Adrenocorticotropic hormone (ACTH); Hypoxia-inducible factor 1α (HIF-1α); Reactive oxygen species (ROS); Chemokine (C-X-C motif) ligand (CXCL-2, CXCL-8 etc.).