Abstract
Identifying genes associated with diabetes mellitus is important for early detection of the disease. This study sought potential genes related to type 2 diabetes mellitus (T2DM). The dataset was GSE25462, and the method was penalized logistic regression, specifically the Lasso. The top eight selected genes were ABRA, EVX1, MIR7-3HG, SAYSD1, SLC26A1, SRGAP3, WFDC1, and 240244_at. On the training data, the model with 8 genes reached an accuracy and a kappa of 1. On the testing data, however, the maximum accuracy was 0.9 and the maximum kappa was 0.615, both obtained in models with 14 genes. This gap is attributed to the small number of positive-class samples in the dataset. Ensemble learning methods are recommended to combine predictive results. The role in T2DM of some of the genes found remains unclear, and biology researchers can study it further.
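The gap between a test accuracy of 0.9 and a kappa of 0.615 is typical of imbalanced classes, because Cohen's kappa discounts the agreement expected by chance. The sketch below computes both metrics from a binary confusion matrix. The counts passed in (1 true positive, 1 false negative, 8 true negatives, 0 false positives) are a hypothetical example chosen because they reproduce the reported values; they are not taken from the paper, and the function name `accuracy_and_kappa` is likewise an illustration only.

```python
def accuracy_and_kappa(tp, fp, fn, tn):
    """Return (accuracy, Cohen's kappa) for a binary confusion matrix."""
    n = tp + fp + fn + tn
    # Observed agreement between predictions and true labels (= accuracy).
    observed = (tp + tn) / n
    # Agreement expected by chance, from the marginal class frequencies.
    chance_pos = ((tp + fn) / n) * ((tp + fp) / n)
    chance_neg = ((tn + fp) / n) * ((tn + fn) / n)
    expected = chance_pos + chance_neg
    # Kappa rescales the observed agreement relative to chance.
    # (Undefined when expected == 1, i.e. only one class is ever seen.)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical counts, NOT from the paper: 10 test samples, 2 of them positive.
acc, kappa = accuracy_and_kappa(tp=1, fp=0, fn=1, tn=8)
print(round(acc, 3), round(kappa, 3))  # prints 0.9 0.615
```

With so few positive samples, a single missed positive case leaves accuracy high while lowering kappa substantially, which is consistent with the explanation given in the abstract.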
Cite
Rochayani, M. Y., Hakim, A. R., & Sugito. (2023). Identifying Genes Related to Diabetes Mellitus Using Penalized Logistic Regression. Journal of Soft Computing and Data Mining, 4(2), 35–42. https://doi.org/10.30880/jscdm.2023.04.02.003