1. Abstract
1.1. Aims: The sudden worldwide outbreak of COVID-19 has sharply increased the economic and treatment burden on healthcare systems. A simple, quick and accurate way to assess the severity of patients with COVID-19 in the early stage after hospital admission is therefore essential.
1.2. Methods: To support precise decision making and clinical planning in hospitals, we collected 84 blood samples from patients with confirmed COVID-19 at the First Affiliated Hospital of the University of Science and Technology of China in Anhui and 25 blood samples from patients with COVID-19 at two hospitals in Shantou. Machine learning tools were used to explore and validate the most significant laboratory indicators for predicting disease severity.
1.3. Results: A new model combining four potential biomarkers (C-reactive protein, albumin, globulin and sodium levels) was applied to predict the severity of patients with COVID-19. Compared with three currently popular assessment systems for pneumonia, the new model performed better when assessed with the AUC, the NRI and the net benefit method.
1.4. Conclusions: Our study provides a simple and operable severity assessment model that quickly predicts the severity of patients after hospital admission. It can help ensure that patients with more severe disease receive treatment priority and can reduce the burden on healthcare systems.
2. Introduction
Coronavirus disease 2019 (COVID-19), a coronavirus pneumonia, is a highly infectious disease that is still spreading worldwide. Symptoms commonly include fever, cough, fatigue and respiratory complications. Rising mortality, attributable in part to insufficient medical resources, is the heaviest burden of COVID-19 worldwide. Hence, identifying patients with severe/critical disease is important because mortality among hospitalized patients is high. The ability to evaluate disease severity is crucial because it guides therapeutic options and helps clinicians make clinical decisions [1]. To date, there are no suitable predictive biomarkers that allow clinicians to identify patients with COVID-19 who require immediate medical attention after hospital admission, so assessing the severity of these patients has become an urgent challenge. Three main pneumonia severity scoring systems are used to assess the severity of patients with COVID-19 in clinical practice: the Clinical Pulmonary Infection Score (CPIS), Confusion-Urea-Respiratory Rate-Blood pressure-65 (CURB-65) and the Pneumonia Severity Index (PSI). However, none of them was designed specifically for COVID-19. Liu et al. reported that a higher proportion of older patients than of young and middle-aged patients with COVID-19 had PSI grade IV or V [2]. Tian et al. suggested that the PSI can be used to stratify patients with COVID-19 after hospitalization [3]. Liu et al. indicated that the CURB-65 score increased concomitantly with the aggravation of acute respiratory distress syndrome in patients with COVID-19 [4]. In a multi-center study in Zhejiang province, patients with COVID-19 were classified by the PSI and CURB-65 together, used as a supplementary classification system for clinical assessment after admission [5].
According to the “Diagnosis and treatment protocol for novel coronavirus pneumonia (Trial version 6)” published by the National Health Commission of China, only patients with decreased arterial oxygen partial pressure or respiratory distress can be classified as having severe/critical disease [6]. To reduce the pressure on the healthcare system and ensure that patients with more severe disease receive treatment priority, we aimed to develop a simple and operable tool to quickly assess the severity of patients after hospital admission. To identify robust and interpretable biomarkers of severity, we developed and validated a mathematical model based on laboratory characteristics in three retrospective cohorts from 3 hospitals in 2 provinces in China. Finally, we compared the discriminative accuracy of the selected biomarkers with that of the CPIS, CURB-65 and PSI.
3. Materials & Methods
3.1. Study Design and Patients
This was a retrospective study of three cohorts of patients with COVID-19 diagnosed according to the “Diagnosis and treatment protocol for novel coronavirus pneumonia (Trial version 6)” published by the National Health Commission of China [6]. The derivation cohort comprised 84 patients admitted from January 20 to February 20, 2020 to the First Affiliated Hospital of the University of Science and Technology of China. The validation cohort comprised 13 patients from the First Affiliated Hospital of Shantou University Medical College and 12 patients from Shantou Central Hospital, admitted from January 19 to February 20, 2020. All patients were confirmed to have SARS-CoV-2 infection by RT-PCR of respiratory tract samples performed by the Centers for Disease Control and Prevention. Patients were divided into mild, common, severe and critical groups according to the Chinese protocol for managing COVID-19 [6].
3.2. Data Collection
We reviewed all clinical data, laboratory characteristics and chest CT scans (Table 1). The clinical data included demographic information, underlying comorbidities, symptoms and signs. Laboratory characteristics included routine blood tests, biomarkers for monitoring the function of multiple organs, and infection-related biomarkers. All data were collected within 24 h after admission. According to the guideline for patients with confirmed COVID-19 from the National Health Commission of China, patients with a mild clinical presentation (no pneumonia) may not initially require hospitalization. Hence, we removed data for 3 patients with mild disease from the derivation cohort to avoid possible bias. In addition, because only 6 patients had a confirmed critical clinical presentation after hospitalization, we merged them into the group with severe clinical presentation to avoid bias. Data for 81 patients with 35 variables were retained.
3.3. Statistical Analysis
In the derivation cohort, some laboratory characteristics had missing data. After deleting 2 variables with a high missing rate (>25%), we imputed the remaining missing values by multiple imputation [7, 8]. We also handled collinearity and filtered mis-measured outliers by jointly considering the variance inflation factor and correlation analysis [9, 10]. A model including 35 candidate predictors was fitted with the cforest implementation of the Random Forest (RF) classification model [11]. This analysis quantifies the importance of each conditioning factor, and several variables had negative importance. We repeatedly ran a loop to remove the variables with negative importance. The importance of the variables related to the severity of COVID-19 was then weighted by the weight of evidence (WoE) method to improve predictive accuracy [12]. Finally, four significant biomarkers were selected by a generalized linear model (GLM) with stepwise selection based on the Bayesian information criterion. The prediction model was depicted as a nomogram. Internal validation was repeated 100 times by randomly splitting the data into a training set (80%, ntrain = 62 samples) and a test set (20%, ntest = 15 samples); we then counted how many times each predictive variable appeared in the resulting models. External validation used data for 25 patients with COVID-19 from 2 hospitals in Shantou. We used 3 common measures to quantify the discrimination accuracy of the 4 models in the validation cohort. The area under the receiver operating characteristic (ROC) curve (AUC) was used to describe the diagnostic ability of a binary classifier [13]. The Net Reclassification Index (NRI) was used to evaluate the improvement in risk prediction gained by adding a marker to a set of baseline predictors [14, 15].
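As a minimal illustration of the two discrimination measures just described, the Python sketch below computes the AUC as the proportion of (severe, common) patient pairs that a score ranks correctly, and a two-category NRI at a single cut-off. It is not the code used in this study; the function names, the toy scores and the 0.5 cut-off are assumptions made only for the example.

```python
from itertools import product


def auc(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p, n in product(pos, neg))
    return wins / (len(pos) * len(neg))


def categorical_nri(old_class, new_class, labels):
    """Two-category NRI of a new classifier over an old one (classes coded 0/1).

    NRI = (events moved up - events moved down) / events
        + (non-events moved down - non-events moved up) / non-events
    """
    n_events = sum(labels)
    n_nonevents = len(labels) - n_events
    if n_events == 0 or n_nonevents == 0:
        raise ValueError("both classes are required")
    event_net = sum(n - o for o, n, y in zip(old_class, new_class, labels) if y == 1)
    nonevent_net = sum(o - n for o, n, y in zip(old_class, new_class, labels) if y == 0)
    return event_net / n_events + nonevent_net / n_nonevents


# Hypothetical toy data: 1 = severe/critical, 0 = common.
labels = [1, 1, 1, 0, 0, 0, 0]
old_scores = [0.7, 0.4, 0.3, 0.6, 0.5, 0.2, 0.1]  # assumed baseline score
new_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.1]  # assumed new-model score
cutoff = 0.5  # assumed classification cut-off
old_class = [int(s >= cutoff) for s in old_scores]
new_class = [int(s >= cutoff) for s in new_scores]

print("AUC old:", auc(old_scores, labels))
print("AUC new:", auc(new_scores, labels))
print("NRI new vs old:", categorical_nri(old_class, new_class, labels))
```

An NRI above 0 means that, on balance, the new classifier moves severe cases up and common cases down relative to the old one, which is the sense in which NRI values are compared in the Results.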
Decision Curve Analysis (DCA) was used to evaluate and compare prediction models in a way that incorporates clinical consequences [16]. In this study, we used DCA to graphically describe the clinical usefulness of each classifier, plotting the threshold probability for misclassification (x axis) against the net benefit of using the model to risk-stratify patients (y axis), relative to assuming that no patient will be misclassified. Statistical analyses were performed with R v3.6.3, and p < 0.05 was considered statistically significant.
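The net benefit plotted on the y axis of a decision curve can be sketched as follows. The sketch uses the standard definition from the decision-curve literature, net benefit = TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability. The function names and the toy data are illustrative assumptions and do not reproduce the R analysis of this study.

```python
def net_benefit(probs, labels, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Reference strategy: classify every patient as severe."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical toy data: 1 = severe/critical, 0 = common.
labels = [1, 1, 1, 0, 0, 0, 0]
probs = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.1]  # assumed predicted risks

for pt in (0.2, 0.41, 0.6):
    print(pt, net_benefit(probs, labels, pt), net_benefit_treat_all(labels, pt))
```

Evaluating these functions over a grid of thresholds and plotting the results gives the decision curve; the "treat none" strategy has a net benefit of 0 at every threshold.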
4. Results
Table 1 describes participant characteristics. The derivation and validation cohorts showed few major differences. The derivation cohort was based on 84 patients with COVID-19 from Hefei. After filtering for collinearity and outliers, 77 patients were retained. Using RF, the full model was approximated by a smaller model including the 14 most predictive variables. Only 13 predictive variables were retained after deleting variables with strength of evidence below “very strong”. Figure 1 shows the weight of evidence for the importance of variables related to the severity of COVID-19. Four significant biomarkers were selected: CRP (P = 0.001), ALB (P = 0.014), GLB (P = 0.013) and sodium (P = 0.006) (Table 2). The nomogram is shown in Figure 2, and the final prediction model is given by formula (1):
Logit(p) = 76.579 + 0.064*CRP - 0.259*ALB + 0.287*GLB - 0.567*sodium ……(1)
In internal validation, the random splitting was repeated 100 times, and the results are described in Table 3. CRP and sodium levels were selected in all 100 models, ALB level in 72 and GLB level in 85, and these variables were therefore used to build the model in the training dataset. Table 2 also describes the results of the other severity-measurement models. Both sets of results indicated that the four selected laboratory characteristics can be regarded as potential biomarkers for identifying the severity of patients with COVID-19. The AUC was highest for the CPIS (AUCCPIS = 0.988) and lower for the four biomarkers (0.881) (Figure 2). However, the four biomarkers discriminated patients with severe/critical disease from those with common disease better than the CPIS did, mainly because the CPIS overestimates the variance when the AUC is close to 1, which is not realistic in clinical trials [17]. DCA demonstrated that the prediction model built on the four biomarkers improved classification accuracy across the threshold probabilities relative to the three popular classifiers.
Table 4 suggested that the new prediction model was the best of the 4 systems because all three NRI values were > 0. The new prediction model was consistently superior to the other 3 models across a wide range of threshold probabilities (Figure 3). For example, the largest difference between the new prediction model and the CPIS occurred at a threshold probability of about 0.41, where the net benefit was about 0.29 for the new prediction model and 0.1 for the CPIS. At that threshold, when the new prediction model was used instead of the CPIS to classify patients and make clinical decisions, the probability of more profitable treatment was 28% (95% CI 0.29-0.1).
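Formula (1) can be turned into a predicted probability of severe/critical disease by applying the inverse logit. The sketch below uses the coefficients reported in formula (1), reading the last term as −0.567 × sodium. The units (CRP in mg/L, ALB and GLB in g/L, sodium in mmol/L) and the example patient values are assumptions for illustration, since the text does not state them.

```python
import math

# Coefficients as reported in formula (1); the sodium term is read as negative.
INTERCEPT = 76.579
COEF = {"crp": 0.064, "alb": -0.259, "glb": 0.287, "sodium": -0.567}


def severity_logit(crp, alb, glb, sodium):
    """Linear predictor of formula (1). Units are assumed, not stated in the paper."""
    return (INTERCEPT
            + COEF["crp"] * crp
            + COEF["alb"] * alb
            + COEF["glb"] * glb
            + COEF["sodium"] * sodium)


def severity_probability(crp, alb, glb, sodium):
    """Inverse logit of the linear predictor, written to avoid overflow."""
    z = severity_logit(crp, alb, glb, sodium)
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Hypothetical patient values, chosen only to exercise the formula.
patient = {"crp": 50.0, "alb": 35.0, "glb": 30.0, "sodium": 135.0}
print("logit:", severity_logit(**patient))
print("probability:", severity_probability(**patient))
```

Because the sodium and albumin coefficients are negative, lowering either value raises the predicted probability, which matches the direction of the associations reported in the text.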
5. Discussion
Our study makes two notable contributions. First, we identified four high-risk factors that precisely and quickly quantify the severity of patients with COVID-19. Second, these four predictive biomarkers can be easily obtained in any hospital. A simple and easily operable model is very useful because it helps clinicians quickly identify the severity of patients after hospital admission when medical resources are limited. Our study developed and externally validated a new severity measure specifically designed to assess patients with COVID-19. It had better discriminative ability than other measures in 3 cohorts, classifying patients with COVID-19 into a common group and a severe/critical group with higher accuracy than 3 existing popular classifiers for pneumonia. The NRI and DCA also indicated that it was the best classifier, and its discriminative ability was externally validated. Four biomarkers were identified as potential risk factors for severe COVID-19 in our study. First, we found that CRP level was positively correlated with the severity of COVID-19, consistent with previous studies. CRP can activate the complement system to enhance the regulation of lymphocytes and promote phagocytosis by macrophages to eliminate invading pathogens [18, 19]. Some studies of COVID-19 showed that CRP level was significantly increased specifically in patients with severe disease [20, 21]. A possible reason is that inflammatory factors such as interleukin 6, interleukin 1 and tumor necrosis factor α promote the synthesis of CRP by hepatocytes [18]. Ko et al. found that CRP ≥ 2 mg/dl was one of the predictive factors for the development of pneumonia in Middle East respiratory syndrome (MERS), whereas CRP ≥ 4 mg/dl, low albumin level, male sex, hypertension, thrombocytopenia and lymphopenia were predictive factors for respiratory failure [22].
A recent retrospective study also showed that CRP levels on admission were significantly higher in patients with COVID-19 who died [23]. Liu et al. reported that IL-6 and CRP could be used as independent factors to predict the severity of COVID-19, and that patients were more likely to have severe complications when their CRP level exceeded 41.8 mg/L [24]. Wang also suggested that CRP level can be regarded as an important biomarker in the early stage of COVID-19 because it reflects lung lesions and disease severity [25]. Albumin was the second potential biomarker found in our study. Albumin is a protein made in the liver and can be measured in the blood. It prevents leakage of fluid from the blood into other organs [26]. An increasing number of studies have shown that low albumin levels are associated with poorer outcomes in patients with COVID-19 [27]. Albumin concentration has been suggested as an independent risk factor for mortality in patients with pneumonia and has also been associated with COVID-19 [28, 29]. A systematic review and meta-analysis showed that hypoalbuminemia increased the risk of severe COVID-19 [30]. Our study also showed that lower sodium was a risk factor for severe COVID-19. Sodium is a predictor in several scoring systems for assessing pneumonia, including the PSI and the Acute Physiology and Chronic Health Evaluation II. Hyponatremia is the most common electrolyte disorder in clinical practice, and severe hyponatremia is associated with increased mortality [31]. Berni et al. found that sodium was inversely correlated with IL-6 and directly correlated with the PaO2/FiO2 ratio in patients with COVID-19 [32]. Bakker and colleagues hypothesized, on the basis of experimental and epidemiological data, that a low sodium balance may augment cellular damage at a given viral load and increase the risk of developing severe and fatal COVID-19 [33].
Finally, globulin was found to be positively related to the severity of COVID-19. Zhang et al. demonstrated that the globulin level is significantly higher in patients with severe COVID-19 than in those with mild disease because of promoted immunoglobulin synthesis [27]. In addition, the CPIS, a diagnostic algorithm, is mainly applied to ventilator-associated pneumonia and community-acquired pneumonia. Most studies indicated that the CPIS had limited sensitivity and specificity [34-37]. The CPIS has also been reported to have high inter-observer variability and is not suitable for multi-center studies [37, 38]. The CURB-65 score consists of 5 separate elements: confusion, uremia, respiratory rate, blood pressure, and age ≥ 65 years. The CURB-65 is relatively simple to use. The PSI involves 20 clinical variables defining 5 classes of increasing risk of mortality and has been extensively validated. However, inappropriate weighting of age or inappropriate threshold values in both the PSI and CURB-65 can lead to underestimation of severe pneumonia, especially in young people [39, 40]. A major limitation of the current study is the small sample size. As more raw data are collected in the future, we will be able to optimize our new model. Another limitation is that we had to merge patients with critical presentation into the severe group because only 6 patients in our study had a critical clinical presentation.
6. Conclusion
In conclusion, this study identified four indicators for evaluating the severity of COVID-19 after hospital admission. The simple and operable new prediction model using these four indicators can support convenient detection and early intervention and may improve the survival rate of patients with COVID-19.
References
1. Lim WS, van der Eerden MM, Laing R, Boersma WG, Karalus N, Town GI, et al. Defining community acquired pneumonia severity on presentation to hospital: an international derivation and validation study. Thorax. 2003; 58: 377-82.
2. Liu K, Chen Y, Lin R, Han K. Clinical features of COVID-19 in elderly patients: A comparison with young and middle-aged patients. J Infect. 2020; 80: e14-8.
3. Tian S, Chang Z, Wang Y, Wu M, Zhang W, Zhou G, et al. Clinical characteristics and reasons of different duration from onset to release from quarantine for patients with COVID-19 outside Hubei province, China. medRxiv. 2020; 7: 210.
4. Liu Y, Sun W, Li J, Chen L, Wang Y, Zhang L, et al. Clinical features and progression of acute respiratory distress syndrome in coronavirus disease 2019. medRxiv. 2020.
5. Yang W, Cao Q, Qin L, Wang X, Cheng Z, Pan A, et al. Clinical characteristics and imaging manifestations of the 2019 novel coronavirus disease (COVID-19): A multi-center study in Wenzhou city, Zhejiang, China. J Infect. 2020; 80: 388-93.
6. Diagnosis and treatment guidelines for novel coronavirus pneumonia (draft version 6). 2020.
7. Carpenter J, Kenward M. Multiple imputation and its application. John Wiley & Sons. 2012.
8. Royston P. Multiple Imputation of Missing Values. The Stata Journal. 2004; 4: 227-41.
9. Akinwande MO, Dikko HG, Samson A. Variance Inflation Factor: As a Condition for the Inclusion of Suppressor Variable(s) in Regression Analysis. Open Journal of Statistics. 2015; 5: 62189.
10. Andrew G, Arora R, Bilmes J, Livescu K. Deep canonical correlation analysis. International conference on machine learning. 2013; 1247-55.
11. Shi T, Horvath S. Unsupervised learning with random forest predictors. Journal of Computational and Graphical Statistics. 2006; 15: 118-38.
12. Regmi NR, Giardino JR, Vitek JD. Modeling susceptibility to landslides using the weight of evidence approach: Western Colorado, USA. Geomorphology. 2010; 115: 172-87.
13. Fawcett T. An introduction to ROC analysis. Pattern recognition letters. 2006; 27: 861-74.
14. Pepe MS, Fan JS, Feng Z, Gerds T, Hilden J. The Net Reclassification Index (NRI): a Misleading Measure of Prediction Improvement Even with Independent Test Data Sets. Stat Biosci. 2015; 7: 282-95.
15. Pencina MJ, D’Agostino RB, D’Agostino DB, Vasan RS. Evaluating the added predictive ability of a new marker: from area under the ROC curve to reclassification and beyond. Stat Med. 2008; 27: 157-72.
16. Vickers AJ, Van Calster B, Steyerberg EW. Net benefit approaches to the evaluation of prediction models, molecular markers, and diagnostic tests. BMJ. 2016; 352: i6.
17. Hanley JA, Hajian-Tilaki KO. Sampling variability of nonparametric estimates of the areas under receiver operating characteristic curves: an update. Acad Radiol. 1997; 4: 49-58.
18. Sproston NR, Ashworth JJ. Role of C-Reactive Protein at Sites of Inflammation and Infection. Front Immunol. 2018; 9: 754.
19. Mac Giollabhui N, Ellman LM, Coe CL, Byrne ML, Abramson LY, Alloy LB. To exclude or not to exclude: Considerations and recommendations for C-reactive protein values higher than 10 mg/L. Brain Behav Immun. 2020; 87: 898-900.
20. Chen G, Wu D, Guo W, Cao Y, Huang D, Wang H, et al. Clinical and immunological features of severe and moderate coronavirus disease 2019. The Journal of Clinical Investigation. 2020; 130: 2620-9.
21. Huang C, Wang Y, Li X, Ren L, Zhao J, Hu Y, et al. Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China. Lancet. 2020; 395: 497-506. doi: 10.1016/S0140-6736(20)30183-5.
22. Ko JH, Park GE, Lee JY, Lee JY, Cho SY, Ha YE, Kang CI, et al. Predictive factors for pneumonia development and progression to respiratory failure in MERS-CoV infected patients. J Infect. 2016; 73: 468-75. doi: 10.1016/j.jinf.2016.08.005.
23. Deng Y, Liu W, Liu K, Fang YY, Shang J, Zhou L, Wang K, et al. Clinical characteristics of fatal and recovered cases of coronavirus disease 2019 (COVID-19) in Wuhan, China: a retrospective study. Chin Med J (Engl). 2020. doi: 10.1097/CM9.0000000000000824.
24. Liu F, Li L, Xu M, Wu J, Luo D, Zhu Y, Li B, Song X, et al. Prognostic value of interleukin-6, C-reactive protein, and procalcitonin in patients with COVID-19. J Clin Virol. 2020; 127: 104370. doi: 10.1016/j.jcv.2020.104370.
25. Wang L. C-reactive protein levels in the early stage of COVID-19. Med Mal Infect. 2020; 50: 332-4. doi: 10.1016/j.medmal.2020.03.007.
26. de la Rica R, Borges M, Aranda M, del Castillo A, Socias A, Payeras A, et al. Low albumin levels are associated with poorer outcomes in a case series of COVID-19 patients in Spain: a retrospective cohort study. Microorganisms. 2020; 8: 1106.
27. Zhang Y, Zheng L, Liu L, Zhao M, Xiao J, Zhao Q. Liver impairment in COVID-19 patients: A retrospective analysis of 115 cases from a single centre in Wuhan city, China. Liver Int. 2020. doi: 10.1111/liv.14455.
28. Liu Y, Yang Y, Zhang C, Huang F, Wang F, Yuan J, Wang Z, et al. Clinical and biochemical indexes from 2019-nCoV infected patients linked to viral loads and lung injury. Sci China Life Sci. 2020; 63: 364-74. doi: 10.1007/s11427-020-1643-8.
29. Kim H, Jo S, Lee JB, Jin Y, Jeong T, Yoon J, Lee JM, Park B. Diagnostic performance of initial serum albumin level for predicting in-hospital mortality among aspiration pneumonia patients. Am J Emerg Med. 2018; 36: 5-11.
30. Aziz M, Fatima R, Lee-Smith W, Assaly R. The association of low serum albumin level with severe COVID-19: a systematic review and meta-analysis. Crit Care. 2020; 24: 255.
31. Corona G, Giuliani C, Parenti G, Norello D, Verbalis JG, Forti G, Maggi M, Peri A. Moderate hyponatremia is associated with increased risk of mortality: evidence from a meta-analysis. PLoS One. 2013; 8: e80451.
32. Berni A, Malandrino D, Parenti G, Maggi M, Poggesi L, Peri A. Hyponatremia, IL-6, and SARS-CoV-2 (COVID-19) infection: may all fit together? J Endocrinol Invest. 2020.
33. Post A, Dullaart RPF, Bakker SJL. Is low sodium intake a risk factor for severe and fatal COVID-19 infection? Eur J Intern Med. 2020; 75: 109.
34. Harde Y, Rao SM, Sahoo JN, Bharuka A, Betham S, Pulla S. Detection of ventilator associated pneumonia, using clinical pulmonary infection score (CPIS) in critically ill neurological patients. Journal of Anesthesiology and Clinical Science. 2013; 2.
35. Fartoukh M, Maitre B, Honore S, Cerf C, Zahar JR, Brun-Buisson C. Diagnosing pneumonia during mechanical ventilation: the clinical pulmonary infection score revisited. Am J Respir Crit Care Med. 2003; 168: 173-9.
36. Cook DJ, Walter SD, Cook RJ, Griffith LE, Guyatt GH, Leasa D, Jaeschke RZ, Brun-Buisson C. Incidence of and risk factors for ventilator-associated pneumonia in critically ill patients. Ann Intern Med. 1998; 129: 433-40.
37. Zilberberg MD, Shorr AF. Ventilator-associated pneumonia: the clinical pulmonary infection score as a surrogate for diagnostics and outcome. Clin Infect Dis. 2009; 51: 1.
38. Schurink CAM, Nieuwenhoven CAV, Jacobs JA, Rozenberg-Arska M, Joore HCA, Buskens E, Hoepelman AIM, Bonten MJM. Clinical pulmonary infection score for ventilator-associated pneumonia: accuracy and inter-observer variability. Intensive Care Med. 2004; 30: 217-24.
39. Chen JH, Chang SS, Liu JJ, Chan RC, Wu JY, Wang WC, Lee SH, Lee CC. Comparison of clinical characteristics and performance of pneumonia severity score and CURB-65 among younger adults, elderly and very old subjects. Thorax. 2010; 65: 971-7.
40. Shah BA, Ahmed W, Dhobi GN, Shah NN, Khursheed SQ, Haq I. Validity of pneumonia severity index and CURB-65 severity scoring systems in community acquired pneumonia in an Indian setting. Indian J Chest Dis Allied Sci. 2010; 52: 9-17.
Canhong Wen. Four Unique Laboratory Characteristics Applied to Assess the Severity of COVID-19. Annals of Clinical and Medical Case Reports 2022