With the growing number of patients suffering from diabetes, early detection and prediction of the disease have become a major area of concern. In this study, we propose a hybrid model that uses data mining techniques to analyze the available data and predict the occurrence of diabetes. The model combines a clustering-based and a classification-based approach: it uses K-means and weighted K-means for clustering and logistic regression for classification. K-means is a simple and widely used technique, but it is highly sensitive to the choice of initial centroids and to outliers, which in turn affects the prediction accuracy of logistic regression. The aim is to improve the initial centroid selection for K-means and to retain as much of the original dataset as possible, in order to enhance the performance of logistic regression. The results show that the classification model achieves an accuracy of 96.97% with K-means and 97.84% with weighted K-means. Further, using the classification results, this paper analyzes the risk associated with diabetic and non-diabetic patients.
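To make the cluster-then-classify idea concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not say how the clustering output feeds the classifier, so the sketch assumes one common arrangement: cluster with plain K-means, drop the records whose class disagrees with the majority class of their cluster, and fit logistic regression on what remains. The function names, the toy data, and the hand-picked initial centroids are all hypothetical, and the weighted K-means variant and the improved centroid selection are not reproduced here.

```python
import math


def kmeans(points, init_centroids, iters=100):
    """Plain K-means (Lloyd's algorithm) started from the given centroids."""
    centroids = [list(c) for c in init_centroids]
    k = len(centroids)
    labels = None
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the centroids are stable
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def filter_by_cluster(points, targets, labels, k):
    """Assumed filtering rule: keep records matching their cluster's majority class."""
    kept_x, kept_y = [], []
    for c in range(k):
        ys = [t for t, l in zip(targets, labels) if l == c]
        if not ys:
            continue
        majority = 1 if 2 * sum(ys) >= len(ys) else 0
        for p, t, l in zip(points, targets, labels):
            if l == c and t == majority:
                kept_x.append(p)
                kept_y.append(t)
    return kept_x, kept_y


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(xs, ys, lr=0.1, epochs=1000):
    """Logistic regression trained with batch gradient descent."""
    n, d = len(xs), len(xs[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        errs = [sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - t
                for x, t in zip(xs, ys)]
        for j in range(d):
            w[j] -= lr * sum(e * x[j] for e, x in zip(errs, xs)) / n
        b -= lr * sum(errs) / n
    return w, b


def predict(x, w, b):
    """Return 1 if the predicted probability is at least 0.5, else 0."""
    return 1 if sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= 0.5 else 0


# Toy data (made up): two well-separated groups plus one record, (0.9, 0.9),
# whose label disagrees with its neighbours.
X = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (1.1, 1.3), (0.9, 0.9),
     (5.0, 5.0), (5.2, 4.8), (4.9, 5.1), (5.1, 5.3)]
y = [0, 0, 0, 0, 1, 1, 1, 1, 1]

labels = kmeans(X, init_centroids=[X[0], X[5]])  # hand-picked start, one per group
kept_X, kept_y = filter_by_cluster(X, y, labels, k=2)
w, b = fit_logistic(kept_X, kept_y)
```

Because the starting centroids are passed in explicitly, the sketch also exposes the sensitivity the abstract mentions: a different `init_centroids` argument can lead to a different partition, and therefore to a different training set for the classifier.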
CITATION STYLE
Bhavna, Verma, R., Handa, R., & Puri, V. (2021). A Hybrid Approach for Diabetes Prediction and Risk Analysis Using Data Mining. In Lecture Notes in Electrical Engineering (Vol. 668, pp. 1213–1230). Springer. https://doi.org/10.1007/978-981-15-5341-7_92