Comparing three data mining algorithms for identifying the associated risk factors of type 2 diabetes

18Citations
Citations of this article
66Readers
Mendeley users who have this article in their library.

Abstract

Background: Increasing the prevalence of type 2 diabetes has given rise to a global health burden and a concern among health service providers and health administrators. The current study aimed at developing and comparing some statistical models to identify the risk factors associated with type 2 diabetes. In this light, artificial neural network (ANN), support vector machines (SVMs), and multiple logistic regression (MLR) models were applied, using demographic, anthropometric, and biochemical characteristics, on a sample of 9528 individuals from Mashhad City in Iran. Methods: This study has randomly selected 6654 (70%) cases for training and reserved the remaining 2874 (30%) cases for testing. The three methods were compared with the help of ROC curve. Results: The prevalence rate of type 2 diabetes was 14% in our population. The ANN model had 78.7% accuracy, 63.1% sensitivity, and 81.2% specificity. Also, the values of these three parameters were 76.8%, 64.5%, and 78.9%, for SVM and 77.7%, 60.1%, and 80.5% for MLR. The area under the ROC curve was 0.71 for ANN, 0.73 for SVM, and 0.70 for MLR. Conclusion: Our findings showed that ANN performs better than the two models (SVM and MLR) and can be used effectively to identify the associated risk factors of type 2 diabetes.

Cite

CITATION STYLE

APA

Esmaeily, H., Tayefi, M., Ghayour-Mobarhan, M., & Amirabadizadeh, A. (2015). Comparing three data mining algorithms for identifying the associated risk factors of type 2 diabetes. Iranian Biomedical Journal, 22(5), 303–311. https://doi.org/10.29252/ibj.22.5.303

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free