AN EXPLORATORY FACTOR ANALYSIS OF LONG COVID



INTRODUCTION
As of September 7, 2022, over 603.7 million cases of SARS-CoV-2 infection have been confirmed worldwide, along with 6.4 million deaths [1]. Many of those infected with SARS-CoV-2 have not fully recovered and exhibit continuing and new symptoms, including fatigue, muscle aches, cardiac issues, and rashes [2]. This continuation of symptoms, and sometimes the development of new ones, following the acute infection has been referred to as "Long COVID" or "Post-Acute Sequelae of SARS-CoV-2 infection" (PASC) [3,4]. Estimates of the prevalence of Long COVID vary from 10% [5] to 80% [6]. According to Huang et al. [7], 55% of those with COVID-19 have at least one sequelae symptom two years after infection. Based on a meta-analysis of worldwide data conducted by Chen and colleagues [8], the prevalence rates of post-COVID-19 conditions at months one, two, three and four were 37%, 25%, 32% and 49%, respectively.
Multiple studies have listed persisting symptoms of Long COVID [9,7,10], including fatigue, muscle or body aches, shortness of breath, and difficulty concentrating or focusing [11].

INFECTIOUS DISEASES OBSERVATIONAL RESEARCH
According to the Centers for Disease Control and Prevention, those with symptoms that persist or newly develop after COVID-19 infection and last for four weeks or more will show symptoms in the following domains: (1) general symptoms, such as tiredness or fatigue, post-exertional malaise, and fever, (2) respiratory and heart symptoms, (3) neurological symptoms, (4) digestive symptoms, and (5) other symptoms, consisting of joint or muscle pain, rash, and changes in menstrual cycles [12].
Identifying the most common symptoms of COVID and post-COVID infection and classification systems can aid in developing criteria for accurate diagnoses.
It is still unclear how long a symptom needs to persist to be included as a Long COVID symptom [14]. The World Health Organization's case definition for post COVID-19 condition [15] states that symptoms should be present three months after the onset of COVID-19 and must last for at least two months. It also states that these symptoms must not be explainable by other medical diagnoses. A definition by NICE [16] also states that symptoms should continue for more than 3 months and must not be explained by an alternative diagnosis. In contrast, several investigators, including Sivan and Taylor [17], recommended that Long COVID symptoms should persist for at least four weeks after infection.
Recent research has found that from 43% to 46% of patients with COVID-19 symptoms meet one or more Myalgic Encephalomyelitis/Chronic Fatigue Syndrome (ME/CFS) case definitions [18,19,20,21]. Research on ME/CFS might suggest ways to measure Long COVID symptoms. The ME/CFS literature has highlighted the need to use reliable and valid questionnaires to measure symptoms, to measure the frequency as well as the severity of symptoms, and to clarify the threshold for determining whether a symptom should be counted [22].
In the ME/CFS field, using statistical methods to classify patients has helped identify latent factors [23], and similar methods are beginning to be employed to identify critical domains of COVID-19 and Long COVID symptoms. For example, Guo et al. [24] [18] PCA resulted in the following five components: fatigue, post-exertional malaise, daytime sleepiness (lethargy), brain fog, and unrefreshing sleep.
An exploratory factor analysis (EFA) is also useful for identifying latent variables, particularly when little is known about the construct of a disease [25]. One of the first EFAs of COVID-19 was conducted by Luo and colleagues [26] with a sample of 60 patients from outpatient and inpatient hospital settings. The EFA resulted in five factors: (1) Respiratory-Digestive-Related, (2) Nervous System-Related, (3) Cough-Related, (4) Upper Respiratory Tract-Related, and (5) Digestive-Related. Luo and colleagues' study [26] had a small sample size, as reflected in their Kaiser-Meyer-Olkin (KMO) test, which indicated that the sample was inadequate for conducting a factor analysis. Furthermore, three of their five factors had only two variables.
Another EFA of COVID was conducted by Yifan and colleagues [27], who evaluated nurses treating COVID-19 pneumonia patients. Yifan et al. [27] created a 16-item survey based on the International Classification of Functioning [28] and expert meetings. A positive aspect of their questionnaire is the use of severity and frequency scales to measure symptoms. Yifan et al. [27] ranked these symptoms and selected those present in the top ten for both frequency and severity. Those symptoms were used in the EFA, resulting in the following three factors: "Breathing and Sleep Disturbance," "Gastrointestinal Complaints and Pain," and "General Symptoms." However, their sample size was small and their KMO indicated the sample was inadequate [29].
An EFA conducted by Jason and Dorri [30] involved 299 participants with Long COVID who reported their symptoms for the first two weeks after contracting COVID and for the most recent two weeks (the KMO was good for both time points). Participants were asked 54 questions from the DePaul Symptom Questionnaire (DSQ-1), a self-report measure of ME/CFS. The DSQ-1 has demonstrated high test-retest reliability, strong internal consistency, and clinically useful results [31,22,32]. The questionnaire also included an additional six symptoms most reported by those with COVID. Each symptom was measured using five-point frequency and severity scales. In another EFA study, using 5,153 hospitalized COVID participants from across Korea, Jo et al. [34] used twelve symptoms, and their resulting three-factor model consisted of (1) Cough, Sputum, and Rhinorrhea (cold-like symptoms), (2) Neurological symptoms such as myalgia, fatigue/malaise and headache, and gastrointestinal symptoms such as nausea/vomiting and diarrhea, and (3) Dyspnea and Altered States of Consciousness. However, only one of their three factors had three or more variables.
Finally, Hughes et al. [35]
Unfortunately, there is still little consensus among researchers and practitioners regarding the key symptoms of Long COVID and how best to classify them. This results in multiple diagnostic criteria, which is the largest source of diagnostic unreliability [36]. Of the several EFA studies that have been used to develop categories of symptoms for Long COVID, most had inadequate sample sizes, did not use reliable or valid questionnaires, or focused on the occurrence of symptoms rather than measuring both frequency and severity.
The current study addresses many of these methodologic and conceptual limitations. In particular, it uses an adequate sample size [37][38][39], a comprehensive list of COVID symptoms, and an adequate ratio of participants to variables [40]. Furthermore, to enable a more sophisticated assessment of symptom burden, the frequency and severity of each symptom were measured, which is an improvement over most prior questionnaires that only measure symptom occurrence.

HYPOTHESES
This study sought to identify the type and number of latent factors of Long COVID and to determine the split-half reliability of the resulting EFA symptoms.

Data collection and participants
Participant recruitment occurred through social media sites. Of the 480 participants, we removed those who did not complete the survey, did not report symptoms during the second month or later after symptom onset, or had been hospitalized. The remaining participants (N = 309) were used to conduct the EFA. The mean duration since initial symptom onset was 74.04 (SD = 37.32) weeks. Participants were asked to report their symptoms over the past month using a new scale called the DePaul Symptom Questionnaire-COVID (DSQ-COVID). In addition, participants' level of impairment was assessed.

Ethics statement
IRB approval was obtained from DePaul University on April 20, 2022 (IRB-2022-590), and participant consent was obtained.

MEASURES

DePaul Symptom Questionnaire-COVID (DSQ-COVID)
The DSQ-COVID collected information on demographics, COVID variant, hospitalization, vaccination status, and related topics. Participants were then asked about COVID-related symptoms. The symptom list was created by identifying the most common symptoms across the COVID-19 research literature. A thorough literature search used the following terms: "exploratory factor analysis of COVID", "exploratory factor analysis of Long COVID", "factor analysis of COVID", and "factor analysis of Long COVID" across several databases (DePaul University Library system, Google Search, and Google Scholar). In addition, possible symptoms were presented to patient communities for their feedback. The information patients provided was adjudicated among the research team, and a list was created and shared with patients for additional feedback. This feedback was evaluated by the research team at the Center for Community Research, DePaul University, and a decision was made on a final list of 38 symptoms [41].
The DSQ-COVID asks participants to report both the frequency (on a 5-point scale from 0, designating none of the time, to 4, indicating all of the time) and the severity (on a 5-point scale from 0, designating symptom not present, to 4, indicating very severe) of each symptom. The frequency and severity scores for each symptom were each multiplied by 25 and then averaged to create a composite score ranging from 0 to 100.
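The composite scoring can be illustrated with a short sketch. It assumes the reading that fits the stated 0 to 100 range: each 0 to 4 rating is rescaled to 0 to 100 (multiplied by 25) and the two rescaled ratings are averaged. The function name `composite_score` is hypothetical and is not part of the DSQ-COVID materials.

```python
def composite_score(frequency: int, severity: int) -> float:
    """Return a 0-100 composite for one symptom from two 0-4 ratings.

    Assumption: each rating is rescaled to 0-100 (x25) and the two
    rescaled ratings are averaged, which keeps the result within 0-100.
    """
    for name, value in (("frequency", frequency), ("severity", severity)):
        if value not in range(5):
            raise ValueError(f"{name} must be an integer from 0 to 4, got {value!r}")
    # Rescale each rating to 0-100, then average the two rescaled ratings.
    return (frequency * 25 + severity * 25) / 2


# Example: a symptom present "about half the time" (2) at "severe" intensity (3).
example = composite_score(2, 3)
```

Under this reading, a symptom rated 4 on both scales scores 100 and a symptom rated 0 on both scales scores 0.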

Outcome measure of functional impairment
Participants were asked to describe their fatigue over the last month on a seven-point scale described elsewhere [19]. Based on this scale, participants were classified as severely impaired (bedbound or homebound), moderately impaired (able to work part-time and leave the house but without energy for other activities), or mildly impaired (fully functional).

Statistical analyses

Removing participants and replacing missing values
Participants were removed if they answered fewer than 90% of the DSQ-COVID symptoms. For the remaining participants, missing values were replaced with the median of responses for the symptom, as described in detail in a publication by Jason and Dorri [41].
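A minimal sketch of this screening and imputation step is shown below. It is an illustration and not the authors' SPSS procedure: the function name `screen_and_impute`, the use of `None` for a missing answer, and the choice to compute each symptom's median from the retained participants only are all assumptions.

```python
from statistics import median


def screen_and_impute(responses, min_answered=0.90):
    """Drop sparse responders, then fill gaps with per-symptom medians.

    responses: list of participant rows; each row lists one score per
    symptom, with None marking a missing answer (an assumed encoding).
    """
    n_items = len(responses[0])
    # Keep participants who answered at least 90% of the symptoms.
    kept = [
        row for row in responses
        if sum(v is not None for v in row) / n_items >= min_answered
    ]
    # Median of the observed answers for each symptom, computed on the
    # retained participants only (an assumption of this sketch).
    medians = []
    for j in range(n_items):
        observed = [row[j] for row in kept if row[j] is not None]
        medians.append(median(observed) if observed else None)
    # Replace each missing answer with that symptom's median.
    return [
        [medians[j] if v is None else v for j, v in enumerate(row)]
        for row in kept
    ]
```

A participant who answered 9 of 10 symptoms is retained under this rule, while one who answered 8 of 10 is removed.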
IBM SPSS Statistics version 28 was used for all analyses. EFA was performed on the 100-point symptom composite scores. Principal axis factoring was used, with factors extracted on the basis of eigenvalues greater than one and a Promax rotation (kappa = 4). The resulting symptom correlations were examined, and symptoms that correlated less than 0.30 or greater than 0.90 with the other symptoms were removed. This process was repeated until all symptoms correlated with one or more other symptoms. Next, the Pattern Matrix from the analysis was evaluated to assess symptom loadings onto factors. Symptoms that loaded less than 0.30 were removed. This process was repeated until each symptom loaded onto a factor. The scree plot was then examined to locate the inflection point and so determine the number of factors to retain. This number was then fixed as the number of factors to extract, the resulting Pattern Matrix was examined, and symptoms loading under 0.30 were dropped. This process was repeated until each symptom loaded onto a factor. The resulting matrix indicated the final number of items to be retained in the EFA. A split-half test of reliability was then conducted for the resulting factor construct and for each factor domain.
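The correlation screening described in this procedure can be sketched as an iterative loop. This is one possible reading and not the authors' SPSS workflow: it assumes a symptom is dropped when its strongest absolute correlation with any other retained symptom is below 0.30 (too isolated) or above 0.90 (redundant), and that symptoms are dropped one at a time in list order. The names `pearson` and `screen_items` are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (both must vary)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def screen_items(scores, low=0.30, high=0.90):
    """Return the symptom names retained after correlation screening.

    scores: dict mapping symptom name -> list of composite scores.
    Assumed rule: drop a symptom whose strongest absolute correlation
    with the other retained symptoms is below `low` or above `high`,
    then re-check, until no symptom qualifies for removal.
    """
    kept = list(scores)
    while True:
        to_drop = None
        for name in kept:
            rs = [abs(pearson(scores[name], scores[other]))
                  for other in kept if other != name]
            if not rs or max(rs) < low or max(rs) > high:
                to_drop = name
                break
        if to_drop is None:
            return kept
        kept.remove(to_drop)
```

Dropping the first member of a highly correlated pair is an arbitrary choice in this sketch; in practice the analyst would decide which of two redundant symptoms to keep.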

RESULTS
The mean duration since initial COVID symptoms was 74.0 (SD = 37.3) weeks. The mean age was 48.3 (SD = 12.03) years. Furthermore, 83.7% of participants identified as female, and most participants lived in North America (70.6%) and Europe (20.7%) (see Table 2).

Exploratory factor analysis and split-half reliability
Bartlett's test of sphericity was significant [χ²(435) = 3996.81, p < 0.001], indicating that the correlation matrix was not an identity matrix. The KMO measure of sampling adequacy (0.88) indicated that the matrix was appropriate for EFA. The scree plot suggested a three-factor model, and each factor had an eigenvalue greater than one. This model explained 35.9% of the variance (see Table 4).
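For readers less familiar with the test, Bartlett's statistic and its degrees of freedom can be written as follows. This is the standard textbook form and is not quoted from the study; n (number of participants), p (number of symptoms entering the test), and R (their correlation matrix) are notation introduced here.

```latex
\chi^2 = -\left[(n - 1) - \frac{2p + 5}{6}\right] \ln\lvert R \rvert,
\qquad
df = \frac{p(p - 1)}{2}
```

Under this form, the reported 435 degrees of freedom would correspond to p = 30 symptoms, since 30 × 29 / 2 = 435.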
Factor one was labeled "General" because it consisted of symptoms from various domains, ranging from heart to musculoskeletal problems. It explained 27.03% of the variance, with loadings for its 16 symptoms ranging from 0.63 to 0.31. Factor two was labeled "PEM/Fatigue/Cognitive Dysfunction," and it explained 5.48% of the variance, with its 6 symptoms having loadings ranging from 0.82 to 0.37. Finally, Factor three was labeled "Psychological," and it explained 3.41% of the variance, with its 3 symptom loadings ranging from 0.87 to 0.76.
Questionnaire reliability was assessed with a split-half reliability test on the DSQ-COVID, which resulted in .92. Furthermore, a split-half reliability test for the twenty-five EFA items resulted in .90. Split-half reliability tests for factors one, two, and three resulted in .86, .79, and .85, respectively.
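As an illustration of what a split-half coefficient measures, the sketch below splits the items into two halves, correlates the half totals across participants, and applies the Spearman-Brown correction. The text does not state how the items were split or which coefficient was reported, so the odd/even split and the Spearman-Brown step are assumptions, and the function names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (both must vary)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def split_half_reliability(item_scores):
    """Spearman-Brown corrected split-half reliability.

    item_scores: list of participant rows, each a list of item scores.
    Assumption: items are split by alternating position (odd/even).
    """
    first_half = [sum(row[0::2]) for row in item_scores]
    second_half = [sum(row[1::2]) for row in item_scores]
    r = pearson(first_half, second_half)
    # Spearman-Brown: estimate full-length reliability from the
    # correlation between the two half-length scores.
    return 2 * r / (1 + r)
```

With this correction, a correlation of .50 between the two halves corresponds to a full-scale reliability of about .67.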

Differences among impairment groups for factor scores
For the three impairment groups (i.e., mildly, moderately, and severely impaired), there was a significant ANOVA difference for factor one scores [F(2, 331) = 14.70, p < .001]. Post-hoc comparisons with Bonferroni corrections found that the severely impaired and moderately impaired groups scored significantly worse than the mildly impaired group. Factor two also revealed a significant difference among the three impairment groups [F(2, 331) = 37.98, p < .01]. The severely impaired group scored significantly worse than the moderately and mildly impaired groups, and the moderately impaired group scored significantly worse than the mildly impaired group. There was not a significant difference for factor three scores (Psychological) [F(2, 331) = 1.30, p = .27].
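The F statistics reported here come from one-way ANOVAs comparing factor scores across the impairment groups. The sketch below shows how such an F value and its degrees of freedom are computed from raw group scores. It is a generic illustration with a hypothetical function name and made-up example numbers, not a reproduction of the study's analysis, and it omits the p-value and the Bonferroni post-hoc comparisons.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]
    # Between-group sum of squares: group size times the squared
    # distance of each group mean from the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Within-group sum of squares: squared distance of each score
    # from its own group mean.
    ss_within = sum(
        (v - m) ** 2 for g, m in zip(groups, group_means) for v in g
    )
    df_between = len(groups) - 1                # k - 1
    df_within = len(all_values) - len(groups)   # N - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Illustrative (made-up) scores for three small groups.
f_stat, df_b, df_w = one_way_anova([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
```

The two numbers in F(df_between, df_within) are the number of groups minus one and the number of scored participants minus the number of groups.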

DISCUSSION
As reviewed in the introduction, most prior EFAs for Long COVID have either had small samples, used questionnaires with unclear psychometric properties, or used simplistic measures of occurrence rather than more refined measures tapping both frequency and severity of symptoms.The current study attempted to correct these shortcomings and found latent factors which might provide researchers and clinicians insight into the construct of Long COVID.
The first factor, a broad category labeled General, may reflect the diverse types of symptoms that are present months after infection. These symptoms range across several organ systems. Three of the eight prior EFA or PCA Long COVID studies (see Table 5) found a comparable symptom domain [7,30,27]. Two of these studies [7,30] used a comprehensive list of symptoms. This factor is also compatible with the CDC's [42] categories of symptoms for Long COVID. It is possible that a general symptoms latent factor is indicative of the underlying pathophysiology of the disease, given that COVID affects so many different biological systems in patients.

Regarding the second factor, labeled PEM/Fatigue/Cognitive Dysfunction, four of the eight prior studies found one or more of these symptom clusters (see Table 5). The latent factor also makes conceptual sense because fatigue and post-exertional malaise can contribute to cognitive impairment, and these types of symptoms are common among those with ME/CFS, which is present in approximately 43% to 46% of patients with COVID-19 symptoms [18][19][20][21]. For this factor, those who are severely impaired by Long COVID appear to have more difficulties than those who are only moderately or mildly affected.
The third factor (Psychological) was present in only two of the eight studies (see Table 5). This is likely because only a few of these studies measured symptoms of psychological distress. When they did, only one or two psychological symptoms were included in their total list of COVID symptoms. This latent factor points to possible mental health challenges worsening or resulting from the long-term impact of Long COVID. This factor was not significantly different across the three levels of impairment (mild, moderate, and severe), suggesting that these symptoms may occur as a result of living with the social and economic burdens of COVID-19 symptoms. The worldwide prevalence rates of depression and anxiety are reported to be 24 and 21 percent, respectively [43]. The uncertainty, and in some cases loss of hope in treatment and health outcomes, can result in or exacerbate psychological distress [44]. Certainly, psychological challenges are common after local or global crises, such as economic and natural disasters [45].
There are several limitations in the current study. For example, most of our participants lived in North America and were women, which potentially limits the generalizability of the findings. Participants varied in the duration of symptoms, and this might have affected the findings. In addition, as with all self-report surveys, recall bias might have occurred in reporting symptoms.
Future studies are needed to assess the trajectory of symptoms. This research will provide greater insights into the longer-term pathophysiology of Long COVID and its multifaceted burden on individuals and society, so as to effectively treat and prevent further deleterious effects. There is also a need to differentiate the findings based on SARS-CoV-2 variant and vaccination status. Additional insight into this disease can be gained from comparing hospitalized and non-hospitalized patients. These analyses will be explored in future publications. Finally, studies assessing patient mental health, social support, and perceived stigma can help researchers and practitioners to better address all the factors associated with Long COVID. Understanding the experiences and day-to-day functioning of those with this disease holds the promise of improving the quality of life for all people.
Time one, shortly after infection, revealed a three-factor model consisting of Cognitive Dysfunction, Autonomic Dysfunction, and Gastrointestinal Dysfunction. At time two, a mean of 21.7 weeks following infection, a three-factor model emerged consisting of Cognitive Dysfunction, Autonomic Dysfunction, and Post-Exertional Malaise. Pinto and colleagues [33] recruited 5,136 patients from across the U.S. experiencing Long COVID. They used an EFA on half the sample and a confirmatory factor analysis (CFA) on the other half. The EFA resulted in a five-factor model, consisting of (1) Cold and Flu-like symptoms, (2) Change in Smell and/or Taste, (3) Dyspnea and Chest Pain, (4) Cognitive and Visual problems, and (5) Cardiac symptoms. Their CFA demonstrated a strong fit. The symptoms used in this study were derived from a content analysis of unstructured data from Facebook posts in which patients described their symptoms.