INTRODUCTION
Coronavirus disease 2019 (COVID-19) is an infectious disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), a novel coronavirus first identified in December 2019 during an outbreak of atypical pneumonia in the city of Wuhan, China1. At the end of January 2020, the outbreak became a global health emergency2-6 and, two months later, was declared a pandemic6.
During the early stages of a new disease outbreak, the number of tests and the infrastructure to apply them are limited, so it is not possible to perform a confirmatory test for every individual4,5. For this reason, clinical and epidemiologic information is useful to predict the presence of a probable case7. Among the several factors that determine the count of COVID-19 cases, the case definition is essential. For example, in Hubei Province in China, the definition was extended for a short period of time so that it could include ‘clinical confirmed’ cases without the need for laboratory testing7,8.
At present, reverse transcriptase-polymerase chain reaction (RT-PCR) is the gold standard for COVID-19 confirmatory diagnosis9, although other complementary tests or methods, like CT scan, can be used in particular cases10-12. In Mexico, the institutions that make up the National Committee for Epidemiologic Surveillance (CONAVE) carry out tests according to case definitions and severity of symptoms. Only 10% of suspected cases with mild symptoms under ambulatory management are sampled, while suspected cases with severe symptoms or those that fulfill the definition of severe acute respiratory infection (SARI) theoretically have a 100% test rate13.
Case definitions vary from country to country and across different stages of the pandemic. An international standardized case definition would allow comparisons of the number of cases around the world5,8,14. In Mexico, the CONAVE released a new case definition on 24 March 2020 through an official notice, which defined a suspected case as a ‘person of any age that in the last 7 days has presented at least two of the following signs and symptoms: cough, fever and headache (cardinal symptoms), accompanied by at least one of the following signs or symptoms: dyspnea, arthralgia, myalgia, odynophagia or pharyngeal discomfort, rhinorrhea, conjunctivitis and thoracic pain’, and a confirmed case as ‘someone that fulfills the operational definition of suspected case and has a confirmatory test by the National Network of Public Health Laboratories recognized by the Institute of Epidemiological Diagnosis and Reference’13. However, it is relevant to mention that this case definition is no different from the one used for epidemiological surveillance since the 2009 influenza pandemic15-17.
Definitions that include epidemiological history and other specific criteria decrease sensitivity, in contrast to definitions that intend to include a broader spectrum of the disease. With a new disease like COVID-19, defining the criteria from specific symptoms is limiting, because some patients are asymptomatic and others present with a different clinical set of symptoms7.
Therefore, the objective of this study is to evaluate the effectiveness of the 24 March COVID-19 case definition for the identification of confirmed SARS-CoV-2 infections in users of the Mexican Institute of Social Security (IMSS) in Tijuana, Mexico. As a secondary analysis, the association between the most frequent symptoms related to COVID-19 and test results was assessed.
METHODS
Study design
A cross-sectional database study was conducted using data from the IMSS’s Epidemiologic Surveillance Online Notification System (SINOLAVE). SINOLAVE includes information from all patients who sought medical attention at IMSS for reasons related to COVID-19, whether they had any type of symptom or an epidemiological nexus to a confirmed or suspected case. The data for the study were extracted on 11 May 2020 and corresponded to the entries recorded from 11 March to 1 May 2020. The extraction was restricted to records from the Baja California delegation. As this was secondary research from an institutional database for epidemiological surveillance, it was exempt from IRB review.
Data
The database consists of the following items: patient ID, registry date, symptoms starting date, clinical history including presence or absence of 16 signs and symptoms (fever, cough, headache, odynophagia, malaise, myalgia, arthralgia, rhinorrhea, chills, abdominal pain, conjunctivitis, dyspnea, cyanosis, diarrhea, thoracic pain, polypnea), personal medical history (including chronic disease, tobacco smoking, alcohol consumption and pregnancy status, as well as history of travel and contact with COVID-19 cases and/or animals), results from RT-PCR for SARS-CoV-2 from nasopharyngeal or oropharyngeal swabs, treatment and outcomes from primary and secondary healthcare systems. Data were recorded in a way that the identity of the patients could not be ascertained.
Participants
The database was cleaned to include only patients of any age registered in Tijuana, Mexico, which corresponded to those notified from IMSS primary care units 7, 18, 19, 27, 33, 34, 35 and 36, and secondary care units 1 and 20. Individuals without a complete personal and clinical history, and patients without a confirmatory test result for SARS-CoV-2, were excluded. Duplicated or triplicated entries were eliminated: if the entries were registered at the same healthcare level, the first chronological record or the one where SARI criteria were met was kept; if they were reported by different healthcare levels, the entry from the highest healthcare level was kept. All records with laboratory test results were included.
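The duplicate-handling rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (`care_level`, `registry_date`, `meets_sari`) are hypothetical and are not SINOLAVE column names, and the sketch assumes that, within the same healthcare level, a record meeting SARI criteria takes precedence over the earliest record, which is one possible reading of the rule.

```python
from datetime import date

def select_record(records):
    """Pick one entry among the duplicated entries of a single patient.

    Each record is a dict with hypothetical keys:
      care_level    -- 1 for primary care, 2 for secondary care
      registry_date -- datetime.date of the entry
      meets_sari    -- True if SARI criteria were met
    """
    # Different healthcare levels: keep entries from the highest level only.
    top_level = max(r["care_level"] for r in records)
    candidates = [r for r in records if r["care_level"] == top_level]
    # Same healthcare level: prefer a SARI record (assumption), then the earliest.
    return min(candidates, key=lambda r: (not r["meets_sari"], r["registry_date"]))

duplicates = [
    {"care_level": 1, "registry_date": date(2020, 4, 1), "meets_sari": False},
    {"care_level": 2, "registry_date": date(2020, 4, 5), "meets_sari": True},
    {"care_level": 2, "registry_date": date(2020, 4, 3), "meets_sari": False},
]
kept = select_record(duplicates)
```

In this made-up example the secondary care entry that meets SARI criteria is kept, even though an earlier entry exists at the same level.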
Variables
A new variable was computed from the 16 signs and symptoms recorded to determine if the patients included in the study fulfilled the criteria of the latest operational definition for suspected cases of COVID-19 (2/3 cardinal symptoms + at least 1 additional symptom). The individual 16 signs and symptoms were also used as independent variables by themselves, as well as their multiple combinations. To evaluate differences in severity of presentation, mild cases were defined as those that fulfilled COVID-19 suspected case definition but did not present with dyspnea, thoracic pain or polypnea, whereas severe cases were defined as those that required hospitalization or presented with dyspnea, fever and cough and at least one of the following: malaise, thoracic pain or polypnea. The RT-PCR result for SARS-CoV-2 from nasopharyngeal or oropharyngeal swabs was the dependent variable.
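The derived variables described above can be expressed as simple set operations. The following is a minimal sketch and not the code used in the study (the analysis was run in SPSS). The function names are hypothetical, symptoms are assumed to be given as lowercase strings, and the severe-case rule is read as "hospitalization, or dyspnea plus fever plus cough plus at least one of malaise, thoracic pain or polypnea".

```python
CARDINAL = {"cough", "fever", "headache"}
ADDITIONAL = {"dyspnea", "arthralgia", "myalgia", "odynophagia",
              "rhinorrhea", "conjunctivitis", "thoracic pain"}
SEVERITY_MARKERS = {"dyspnea", "thoracic pain", "polypnea"}

def meets_case_definition(symptoms):
    """At least 2 of 3 cardinal symptoms plus at least 1 additional symptom."""
    present = set(symptoms)
    return len(present & CARDINAL) >= 2 and len(present & ADDITIONAL) >= 1

def classify_severity(symptoms, hospitalized=False):
    """Return 'severe', 'mild' or 'unclassified' (the last label is an assumption)."""
    present = set(symptoms)
    # Severe: hospitalized, or dyspnea + fever + cough plus one further sign.
    severe_pattern = ({"dyspnea", "fever", "cough"} <= present
                      and bool(present & {"malaise", "thoracic pain", "polypnea"}))
    if hospitalized or severe_pattern:
        return "severe"
    # Mild: fulfills the case definition without dyspnea, thoracic pain or polypnea.
    if meets_case_definition(present) and not (present & SEVERITY_MARKERS):
        return "mild"
    return "unclassified"
```

For instance, a patient with cough, fever and myalgia fulfills the definition and is classified as mild, whereas cough and fever alone do not fulfill it because no additional symptom is present.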
Statistical analysis
Descriptive statistics were used to characterize the study population. The COVID-19 case definition was compared to RT-PCR results using 2×2 tables to estimate sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV), as well as likelihood ratios (LR). Categorical variables were analyzed using the χ2 test for bivariate analysis and the Mantel-Haenszel test to control for confounding, stratifying by gender, age group, history of chronic disease and severity of presentation. Statistical analysis was conducted using IBM SPSS Statistics (Version 25). A p<0.05 was considered statistically significant.
RESULTS
From a total of 10216 entries from SINOLAVE subset data source (Figure 1), 897 were analyzed after excluding 9319 that did not satisfy inclusion criteria (3858 did not belong to Tijuana, 72 were repeated, 30 had incomplete information and 5359 did not have a SARS-CoV-2 test result).
Of the 897 included, 558 (62.2%) had a positive result and 339 (37.8%) had a negative result for SARS-CoV-2 (Table 1). The median age of participants was 45 years (SD=16) and the range was 0-88 years; 483 (55%) were male, while 404 (45%) were female. By age group, 47% were aged 40-59 years, 34.8% were 16-39 years, 15.6% were ≥60 years, 1.5% were 0-5 years and 1.1% were 6-15 years. Regarding medical history, 541 (60.3%) did not have any chronic diseases, while 356 (39.7%) did. Among participants’ chronic diseases, hypertension stood out with a prevalence of 29.8% in COVID-19 confirmed cases and 20.8% in ruled-out cases, followed by diabetes (23.5% vs 14.8%) and obesity (14.5% vs 13.6%). Additionally, the group with positive results included 6 pregnant patients, compared with 5 in the negative group. Smoking was reported in 8.1% of SARS-CoV-2 cases and in 6.2% of ruled-out cases.
Table 1
Using the frequencies shown in Table 2, a sensitivity of 87.45%, a specificity of 10.61%, a PPV of 61.69% and an NPV of 33.96% were calculated for the COVID-19 suspected case definition. This resulted in a diagnostic accuracy of 58.42%; LR+ was 0.98 and LR- was 1.18. A Cohen’s kappa of -0.022 indicated no agreement beyond chance between the case definition and the confirmatory RT-PCR test. There was no significant association between the COVID-19 suspected case definition and RT-PCR for SARS-CoV-2 (χ2=0.750, df=1, p=0.386).
Table 2
Fulfills case definition | SARS-CoV-2 positive n (%) | SARS-CoV-2 negative n (%) | Total n (%) |
---|---|---|---|
Yes | 488 (87.45) | 303 (89.39) | 791 (88.18) |
No | 70 (12.55) | 36 (10.61) | 106 (11.82) |
Total | 558 | 339 | 897 |
[i] Sensitivity = 488/558 = 0.8745 (87.45%). Specificity = 36/339 = 0.1061 (10.61%). Accuracy = (488 + 36)/897 = 0.5842 (58.42%). Positive predictive value (PPV) = 488/791 = 0.6169 (61.69%). Negative predictive value (NPV) = 36/106 = 0.3396 (33.96%). Positive likelihood ratio (LR+) = 0.8745/(1 - 0.1061) = 0.98. Negative likelihood ratio (LR-) = (1 - 0.8745)/0.1061 = 1.18.
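The figures derived from Table 2 can be reproduced from the four cell counts alone. The sketch below is an illustration, not the SPSS procedure used in the study; the function name `diagnostic_metrics` is hypothetical, the χ2 statistic is the Pearson statistic without continuity correction, and its p-value uses the identity that a χ2 variable with 1 degree of freedom has survival function erfc(√(x/2)).

```python
import math

def diagnostic_metrics(tp, fp, fn, tn):
    """Diagnostic accuracy measures for a 2x2 table.

    tp: definition met, RT-PCR positive    fp: definition met, RT-PCR negative
    fn: definition not met, RT-PCR positive tn: definition not met, RT-PCR negative
    """
    n = tp + fp + fn + tn
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    accuracy = (tp + tn) / n
    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_exp = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (accuracy - p_exp) / (1 - p_exp)
    # Pearson chi-square (df=1, no continuity correction) and its p-value.
    chi2 = n * (tp * tn - fp * fn) ** 2 / (
        (tp + fp) * (fn + tn) * (tp + fn) * (fp + tn))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": accuracy,
        "lr_pos": sens / (1 - spec),
        "lr_neg": (1 - sens) / spec,
        "kappa": kappa,
        "chi2": chi2,
        "p_value": p_value,
    }

# Cell counts taken from Table 2.
metrics = diagnostic_metrics(tp=488, fp=303, fn=70, tn=36)
```

With the Table 2 counts, the returned values agree with the reported estimates to within rounding in the last decimal place.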
Overall, the most common symptoms were cough (82.6% vs 84.4%), fever (82.1% vs 74.6%) and headache (69.9% vs 72.6%) in the positive and negative SARS-CoV-2 groups, respectively (Table 3). Furthermore, fever and cough, each present in 89.0%, were the most frequently reported symptoms in patients who satisfied the suspected case definition and had a positive result for SARS-CoV-2. Meanwhile, among the cases that did not meet the criteria of the COVID-19 case definition, the most frequent symptom was dyspnea, present in 71.6% of RT-PCR confirmed cases and 45.5% of ruled-out cases. Dyspnea was present in 76.9% of all confirmed cases.
Table 3
After individually analyzing each of the 16 signs and symptoms, only 4 had a statistically significant association with a positive RT-PCR result for SARS-CoV-2 (Supplementary file, Table S1). These symptoms were dyspnea (χ2=43.706, df=1, p<0.001), odynophagia (χ2=26.373, df=1, p<0.001), rhinorrhea (χ2=20.970, df=1, p<0.001) and fever (χ2=7.117, df=1, p=0.008). Based on these findings, an exploratory analysis was conducted to determine the association between a confirmatory test result and different combinations of the COVID-19 suspected case definition (Table 4). The association between the new definitions and the RT-PCR result for SARS-CoV-2 was statistically significant when adding dyspnea, odynophagia or rhinorrhea to the definition and in all possible combinations of the cardinal symptoms, except cough + headache + at least 1 other. No combination showed a better sensitivity and specificity profile than the COVID-19 case definition by itself, although a high specificity was observed when adding cyanosis or polypnea (98.23% for both), and conjunctivitis or abdominal pain (93.21% for each).
Table 4
To control for confounding, the Mantel-Haenszel test was performed stratifying by age group, gender, severity of disease and history of chronic disease. We found age group to be a possible confounding factor, with a Breslow-Day test of homogeneity of odds ratios of p=0.024 (ORMH=0.783, p=0.042, 95% CI: 0.618-0.991), given a statistically significant association between the COVID-19 suspected case definition and RT-PCR in the 16-39 years group (χ2MH=6.003, df=1, p=0.014) and a borderline association (χ2MH=3.846, df=1, p=0.05) in the 40-59 years group. The absence of association persisted when stratifying by gender (homogeneity p=0.662; ORMH=0.786, p=0.206, 95% CI: 0.542-1.140), history of chronic disease (homogeneity p=0.422; ORMH=0.821, p=0.255, 95% CI: 0.584-1.153) and severity of cases (homogeneity p=0.097; ORMH=0.949, p=0.797, 95% CI: 0.669-1.346).
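For readers unfamiliar with the pooled estimate, the Mantel-Haenszel odds ratio weights each stratum-specific 2×2 table by its size. The sketch below shows the standard formula, ORMH = Σ(a·d/n) / Σ(b·c/n); it is not the SPSS output of this study. The stratum counts in the second example are hypothetical placeholders, because the stratified tables are not reported in the text; only the crude Table 2 counts are taken from the article.

```python
def mantel_haenszel_or(strata):
    """Pooled odds ratio across strata.

    Each stratum is a tuple (a, b, c, d):
      a: definition met, RT-PCR positive     b: definition met, RT-PCR negative
      c: definition not met, RT-PCR positive d: definition not met, RT-PCR negative
    """
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator

# With a single stratum the estimate reduces to the crude odds ratio (Table 2 counts).
crude_or = mantel_haenszel_or([(488, 303, 70, 36)])

# Hypothetical strata (placeholder counts) that share the same odds ratio of 2/3.
pooled_or = mantel_haenszel_or([(10, 20, 30, 40), (5, 10, 15, 20)])
```

When all strata share the same odds ratio, as in the hypothetical example, the pooled estimate equals that common value; a pooled estimate that departs from the crude one is what suggests confounding by the stratifying variable.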
DISCUSSION
Although the suspected case operational definition is the most feasible clinical tool for identifying probable cases of COVID-19 during the pandemic, there was no statistically significant association between fulfilling the case definition criteria and the RT-PCR result for SARS-CoV-2. Moreover, the suspected case definition has a sensitivity of 87.45% and a specificity of only 10.61%, with a diagnostic accuracy of 58.42%. Additionally, the most frequent symptoms among COVID-19 confirmed cases were cough, fever and headache, similar to those reported by Huang et al.12 and Chen et al.18 in China. However, these same symptoms were also the most common among the group with a negative RT-PCR result for SARS-CoV-2. Dyspnea was the most frequent symptom in those who did not fulfill the case definition but had a positive result for SARS-CoV-2 and was present in 76.6% of all confirmed cases. This is consistent with the findings of Bhatraju et al.19 in the United States, where dyspnea and cough were present in 88% of confirmed cases. Only fever, rhinorrhea, odynophagia and dyspnea had a statistically significant association with a confirmatory test result. However, stratifying by age group showed that age may be a confounding factor, given that the association was statistically significant in the 16-39 years group and borderline in the 40-59 years group. This suggests a possible association between different clinical presentations of COVID-19 and age.
To evaluate the COVID-19 operational definition, we first have to consider that a desirable characteristic of an epidemiologic surveillance detection strategy, such as a case definition, is high sensitivity. In this case, the sensitivity is 87.45%, therefore it can be assumed that the usefulness of the definition is limited to identifying individuals with the disease. However, with such a low specificity, the false positive rate is greatly increased: 89.39% of the individuals with a negative RT-PCR result also fulfilled the definition. In Mexico, where confirmatory tests are not applied to all identified suspected cases, this situation is problematic in clinical practice and places a bigger burden of COVID-19 on health systems when measures to control further transmission of the disease, such as sick leave, are applied indiscriminately to the economically productive population. Even though it was not possible to establish a statistically significant relation between the COVID-19 case definition and confirmatory test results, our findings do not prove that the 24 March case definition is inappropriate. The development of an operational definition is a daunting task. As was observed, none of the case definition combinations tested in this study had a likelihood ratio profile that made it a powerful predictor of SARS-CoV-2 infection, nor did any have both higher sensitivity and higher specificity.
Case definitions in this kind of epidemiological setting are very dynamic. The previous operational definition for a suspected case in Mexico included the criteria of travel history or contact with a suspected or confirmed case, while the current COVID-19 case definition, which has been in force since the start of the community transmission scenario, only includes signs and symptoms regardless of any epidemiological nexus. No major changes were made to it between 24 March 2020 and the submission of this manuscript. The statistically significant association between RT-PCR results and individual signs and symptoms not featured as cardinal in the 24 March case definition (rhinorrhea, odynophagia and dyspnea) suggests that other signs and symptoms could be of greater value to improve the epidemiological surveillance of COVID-19 in the future. Considering that this case definition is the same definition used for epidemiological surveillance of Influenza-like Illness (ILI) since the 2009 pandemic16, and with newly published evidence by authors around the world describing new or ‘rare’ clinical manifestations20-22, it could be considered that COVID-19 has a different clinical presentation from influenza. For example, the database for this study did not include other symptoms like anosmia and ageusia at the time of data extraction, but these have been described as common findings in COVID-19 patients by Vaira et al.23 in Italy. Hence the importance of evaluating the effectiveness of case definitions in order to improve them. In our exploratory analysis, when individual symptoms that were significantly associated with SARS-CoV-2 test results were added to the 24 March definition, the associations between these new definitions and RT-PCR results were statistically significant. For that reason, it would be worth evaluating the addition of other cardinal and noncardinal symptoms to the definition.
A key component of the case definition is the inclusion criterion of ‘person of any age’. However, our analysis identified age as a possible confounding factor, since the association between the definition and the test result was statistically significant after adjusting for age group, particularly in the 16-39 years group, and borderline in the 40-59 years group. The lack of statistical significance in the ≤15 years and ≥60 years groups may be explained by the atypical manifestations often found at both ends of the age spectrum24,25. Although this definition does not require fever as a criterion for people aged ≥60 years and suggests irritability as a substitute for headache in children aged ≤5 years, more studies are needed to evaluate specific factors related to these age groups, considering that they are underrepresented in our study, with children and elderly patients only accounting for 2.6% and 15.6% of the study population, respectively. Although it could be counterintuitive to generate a separate case definition for each specific age group, it is important to characterize the specific clinical features of each one, since the definition can only predict SARS-CoV-2 infection in some age groups in this study population.
Limitations
The limitations of this study are inherent to the design itself, considering that the data used were not collected with the objective of answering our research question. The entries in the database were heterogeneous to varying degrees, which means that errors in categorization could have been made; it also lowered the number of patients that fulfilled the inclusion criteria and ultimately decreased the size of our study population. This is a cross-sectional study design, therefore some data, especially those related to outcomes, could be missing. More than half of the study population were severe symptomatic cases and, given that testing was systematically performed in most patients with severe symptoms and all hospitalized cases, almost two-thirds of the patients with a confirmed SARS-CoV-2 infection were severe cases. In contrast, more than half of the patients with a negative result in this study were mild cases. Therefore, there is a risk of selection bias related to the exclusion of patients without an RT-PCR confirmatory test and to the fact that the data source only included patients with symptoms or those who had contact with suspected or confirmed cases. Thus, our study population did not include the roughly 90% of mild symptomatic cases that did not get tested, nor the proportion of asymptomatic patients with COVID-19, which recently has been reported to be up to 45%26. On the other hand, even if RT-PCR is the current gold standard confirmatory test for COVID-19, the sensitivity and specificity of this test vary according to the anatomical site sampled9; for that reason, patients could also have been incorrectly classified.
Tijuana is the busiest US-border crossing city in Mexico and has population dynamics very different from those of other regions of the country. In addition, although the IMSS system is the second largest public healthcare system, with a 36.4% coverage of the Mexican population27, its users are mainly employees working in private companies and their families. Thus, generalizations should be made cautiously.
To our knowledge, there are no published studies that have evaluated the case definitions for COVID-19 in Mexico. However, after analyzing different case definitions for infectious diseases like SARS, some authors have concluded that it is always possible to improve and increase diagnostic accuracy by adding laboratory tests or specific symptoms to current definitions28-30.
CONCLUSIONS
In this cross-sectional database study, the 24 March COVID-19 case definition did not show a significant association with RT-PCR results, nor was it a powerful predictor of SARS-CoV-2 infection among IMSS users in Tijuana, Mexico. The most common symptoms in this population were cough and fever. Symptoms like rhinorrhea, odynophagia and dyspnea had a greater association with test results, and this COVID-19 case definition better identified cases within the 16-39 years group, suggesting a possible relation between the type of clinical presentation and age. Nevertheless, more studies with a larger, more open population are required to allow for comparisons between other specific groups, different regions in the country, and even different countries to corroborate these results.