Classification and Analysis of Patients with COVID-19 Using Machine Learning

Glaucia Maria Bressan
Elisângela da Silva Lizzi


The rapid spread of coronavirus disease (COVID-19) has prompted research across many fields in search of treatments, vaccines and preventive measures. The pandemic has been especially challenging because of the substantial demand it places on medical infrastructure. In this context, this paper applies Machine Learning methods to classify and analyse the outcome of patients with COVID-19 as discharge or death, and to describe the profile of patients infected by the coronavirus. The dataset consists of clinical data from Sírio Libanês Hospital, available in the FAPESP repository (2020). Results indicate that, among all tested classifiers, the Naive Bayes algorithm performs best and best represents the phenomenon under study, in both classification and inductive numerical analysis of the epidemiological phenomenon of COVID-19.
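
As a rough illustration of the kind of classifier named in the abstract, the following is a minimal Gaussian Naive Bayes sketch in plain Python. It is not the authors' implementation (their reference list cites R), and every value in it is an assumption: the two features (age and a hypothetical lab marker), the six toy rows and the function names `fit_gaussian_nb` and `predict` are invented for the example and do not come from the FAPESP dataset.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Estimate the prior, feature means and feature variances of each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small constant keeps every variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(X), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest log-posterior for one patient row."""
    def log_posterior(params):
        prior, means, variances = params
        total = math.log(prior)
        for v, m, var in zip(row, means, variances):
            # Log of the Gaussian density, one independent term per feature.
            total += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return total
    return max(model, key=lambda label: log_posterior(model[label]))


# Synthetic toy rows: [age in years, hypothetical lab marker].
# These numbers are made up and are NOT taken from the FAPESP data.
X = [[34, 1.1], [45, 1.4], [52, 1.8], [78, 6.2], [83, 7.5], [71, 5.9]]
y = ["discharge", "discharge", "discharge", "death", "death", "death"]

model = fit_gaussian_nb(X, y)
print(predict(model, [40, 1.3]))
print(predict(model, [80, 6.8]))
```

The sketch treats features as conditionally independent given the outcome, which is the defining assumption of Naive Bayes; a real analysis would also need held-out evaluation and handling of categorical and missing clinical variables.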

How to Cite
Bressan, G. M., & da Silva Lizzi, E. (2023). Classification and Analysis of Patients with COVID-19 Using Machine Learning. Brazilian Journal of Biometrics, 41(1), 18–29.

References


Acter, T., Uddin, N., Das, J., Akhter, A., Choudhury, T.R., Kim, S. Evolution of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) as coronavirus disease 2019 (COVID-19) pandemic: a global health emergency. Science of the Total Environment. 730, e138996 (2020).

Aggarwal, C.C. Data classification: algorithms and applications. (CRC Press, Yorktown Heights, New York, USA, 2014).

Ahmad, A., Garhwal, S., Ray, S.K., Kumar, G., Malebary, S.J., Barukab, O.M. The number of confirmed cases of COVID-19 by using machine learning: methods and challenges. Archives of Computational Methods in Engineering. 28, 2645-2653 (2020).

Alelyani, S., Liu, H., Wang, L. The effect of the characteristics of the dataset on the selection stability. In: Proceedings of the International Conference on Tools with Artificial Intelligence. IEEE, 970-977 (2011).

Alimadadi, A., Aryal, S., Manandhar, I., Munroe, P.B., Joe, B., Cheng, X. Artificial intelligence and machine learning to fight COVID-19. American Physiological Society, Bethesda, MD (2020).

Allam, M., Cai, S., Ganesh, S., Venkatesan, M., Doodhwala, S., Song, Z., Hu, T., Kumar, A., Heit, J., Coskun, A.F., et al. COVID-19 diagnostics, tools, and prevention. Diagnostics. 10, 1-33 (2020).

Beckmann, J.S., Lew, D. Reconciling evidence-based medicine and precision medicine in the era of big data: challenges and opportunities. Genome Medicine. 8, 1-11 (2016).

Box, G.E.P., Tiao, G.C. Bayesian inference in statistical analysis. (John Wiley and Sons, Canada, 1992).

Brazil. Ministry of Health. Coronavirus Panel Brazil. Available at <> (accessed on April 26, 2022).

Breiman, L. Random forests. Machine Learning. 45, 5-32 (2001).

Dong, E., Du, H., Gardner, L. An interactive web-based dashboard to track COVID-19 in real time. The Lancet Infectious Diseases. 20, 533-534 (2020).

Dougherty, G. Pattern recognition and classification: an introduction. (Springer Science & Business Media, California, USA, 2012).

FAPESP. FAPESP COVID-19 Data Sharing/BR, (2020).

Gao, Y., Cai, G.Y., Fang, W., Li, H.Y., Wang, S.Y., Chen, L., Yu, Y., Liu, D., Xu, S., Cui, P.F., et al. Machine learning based early warning system enables accurate mortality risk prediction for COVID-19. Nature Communications. 11, 1-10 (2020).

Gordis, L. Epidemiology. (Elsevier Saunders, Philadelphia, PA, 2014).

Grzybowski, J.M.V., Da Silva, R.V., Rafikov, M. Expanded SEIRCQ model applied to COVID-19 epidemic control strategy design and medical infrastructure planning. Mathematical Problems in Engineering. e8198563 (2020).

Han, J., Kamber, M., Pei, J. Data mining: concepts and techniques. (Morgan Kaufmann, Burlington, MA, USA, 2012).

Hou, W., Zhao, Z., Chen, A., Li, H., Duong, T.Q. Machining learning predicts the need for escalated care and mortality in COVID-19 patients from clinical variables. International Journal of Medical Sciences. 18 (8), 1739-1745 (2021).

Lalmuanawma, S., Hussain, J., Chhakchhuak, L. Applications of machine learning and artificial intelligence for COVID-19 (SARS-CoV-2) pandemic: a review. Chaos, Solitons & Fractals. e110059 (2020).

Maimon, O.Z., Rokach, L. Data mining with decision trees: theory and applications. (World Scientific, 2014).

Mello, L. E. et al. Opening Brazilian COVID-19 patient data to support world research on pandemics. Zenodo, (2020).

Nemati, M., Ansary, J., Nemati, N. Machine-learning approaches in COVID-19 survival analysis and discharge-time likelihood prediction using clinical data. Patterns. 1, e100074 (2020).

R Core Team. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria (2020).

Ranzani, O.T., Bastos, L.S.L., Gelli, J.G.M., Marchesi, J.F., Baião, F., Hamacher, S., Bozza, F.A. Characterisation of the first 250 000 hospital admissions for COVID-19 in Brazil: a retrospective analysis of nationwide data. The Lancet Respiratory Medicine. (2021).

Rodríguez-Morales, A., Macgregor, K., Kanagarajah, S., Patel, D., Schlagenhauf, P. Going global – travel and the 2019 novel coronavirus. Travel Medicine and Infectious Disease. 33, e101578 (2020).

Steyerberg, E.W., Vickers, A.J., Cook, N.R., Gerds, T., Gonen, M., Obuchowski, N., Pencina, M.J., Kattan, M.W. Assessing the performance of prediction models: a framework for traditional and novel measures. Epidemiology. 21, 128-138 (2010).

Vere, J., Gibson, B. Evidence-based medicine as science. Journal of Evaluation in Clinical Practice. 25, 997-1002 (2019).

Yadav, M., Perumal, M., Srinivas, M. Analysis on novel coronavirus (COVID-19) using machine learning methods. Chaos, Solitons & Fractals. 139, 110050 (2020).

Zeng, X., Zhang, Y., Kwong, J.S.W., Zhang, C., Li, S., Sun, F., Niu, Y., Du, L. The methodological quality assessment tools for preclinical and clinical studies, systematic review and meta-analysis, and clinical practice guideline: a systematic review. Journal of Evidence-Based Medicine. 8, 2-10 (2015).
