Classification and biomarker selection in lower-grade glioma using robust sparse logistic regression applied to RNA-seq data

João Carrilho; Marta B. Lopes

doi:10.28951/bjb.v40i4.634

Published: Dec 31, 2022

Keywords:

Glioma; Classification; Sparse Logistic regression; Robust Statistics; Elastic net regularization

João Carrilho

NOVA School of Science and Technology

Marta B. Lopes

Center for Mathematics and Applications (NOVA MATH), NOVA School of Science and Technology; NOVA Laboratory for Computer Science and Informatics

Abstract

Effective cancer diagnosis and treatment remain a challenge for the development of personalized medicine, mostly because of tumor heterogeneity. Gliomas are brain tumors that are highly heterogeneous at the histological, cellular and molecular levels and carry a poor prognosis, and the mechanisms behind their heterogeneity and progression remain poorly understood. Recent advances in biomedical high-throughput technologies have allowed the generation of large amounts of molecular information from patients. Combined with statistical and machine learning techniques, this information can be used to define glioma subtypes and targeted therapies, an invaluable contribution to disease understanding and effective management.
In this work, sparse and robust sparse logistic regression models with the elastic net penalty were applied to glioma RNA-seq data from The Cancer Genome Atlas (TCGA) to identify transcriptomic features relevant to the separation between lower-grade glioma (LGG) subtypes and to flag putative outlying observations. In general, all classification models yielded good accuracies while selecting different sets of genes. Among the genes selected by the models, TXNDC12, TOMM20, PKIA, CARD8 and TAF12 have been reported as having a relevant role in glioma development and progression. This highlights the suitability of the present approach for disclosing relevant genes and encourages the biological validation of genes not yet reported.
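
For reference, logistic regression with the elastic net penalty is commonly written as the optimization problem below. This is the standard textbook form, in the parametrization used by glmnet-style solvers, and not necessarily the exact formulation used in the article. The symbols n, p, x_i, y_i, \beta_0, \beta, \lambda and \alpha are notation assumed here; none of them is taken from the abstract.

```latex
% Elastic net penalized logistic regression (standard form; notation assumed)
\min_{\beta_0,\,\beta}\;
-\frac{1}{n}\sum_{i=1}^{n}\Big[\, y_i\,(\beta_0 + x_i^{\top}\beta)
  - \log\!\big(1 + e^{\beta_0 + x_i^{\top}\beta}\big) \Big]
+ \lambda \Big[ \alpha\,\lVert\beta\rVert_1
  + \frac{1-\alpha}{2}\,\lVert\beta\rVert_2^2 \Big]
```

Here y_i \in \{0, 1\} codes the subtype of sample i, x_i is its vector of p gene expression values, \lambda \ge 0 sets the overall penalty strength, and \alpha \in [0, 1] balances the L1 term, which sets coefficients exactly to zero and so selects genes, against the L2 term, which stabilizes the fit among correlated genes. One common way to make such a model robust is to evaluate the likelihood only on a subset of observations that fit well, so that poorly fitting samples can be flagged as potential outliers; the abstract does not state which robust scheme was used.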

How to Cite

Carrilho, J., & Lopes, M. B. (2022). Classification and biomarker selection in lower-grade glioma using robust sparse logistic regression applied to RNA-seq data. Brazilian Journal of Biometrics, 40(4), 371–381. https://doi.org/10.28951/bjb.v40i4.634

Issue

Vol. 40 No. 4 (2022): 40th Anniversary of the Brazilian Journal of Biometrics

Section

Articles

This work is licensed under a Creative Commons Attribution 4.0 International License.

