Machine learning applied to the prediction of diabetes mellitus, using socioeconomic and environmental information from health system users

0Citations
Citations of this article
18Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

Objective: The objective was to apply models based on machine learning techniques to support the early diagnosis of diabetes mellitus, using environmental, social, economic and health data variables, without dependence on clinical sample collection. Methodology: Data from 10,889 users affiliated with the subsidized health system in the southwestern area of Colombia, diagnosed with hypertension and grouped into users without (74.3%) and with (25.7%) diabetes mellitus, were used. Supervised models were trained using k-nearest neighbors, decision trees, and random forests, as well as ensemble-based models, applied to the database before and after balancing the number of cases in each diagnostic group. The performance of the algorithms was evaluated by dividing the database into training and test data (70/30, respectively), and metrics of accuracy, sensitivity, specificity, and area under the curve were used. Results: Sensitivity values increased significantly when using balanced data, going from maximum values of 17.1% (unbalanced data) to values as high as 57.4% (balanced data). The highest value of area under the curve (0.61) was obtained with the ensemble models, by applying a balance in the amount of data for each group and by coding the categorical variables. The variables with the greatest weight were associated with hereditary aspects (24.65%) and with the ethnic group (5.59%), in addition to visual difficulty, low water consumption, a diet low in fruits and vegetables, and the consumption of salt and sugar. Conclusions: Although predictive models, using people's socioeconomic and environmental information, emerge as a tool for the early diagnosis of diabetes mellitus, their predictive capacity still needs to be improved.

Cite

CITATION STYLE

APA

Mejía, J. A., Oviedo-Benalcázar, M. A., Ordoñez, J. A., & Valencia, J. F. (2023). Machine learning applied to the prediction of diabetes mellitus, using socioeconomic and environmental information from health system users. Revista Facultad Nacional de Salud Publica, 41(2). https://doi.org/10.17533/udea.rfnsp.e351168

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free