This paper illustrates the application of various ensemble methods to improve classification accuracy. The case in point is the Pima Indian Diabetic Dataset (PIDD). The computational model comprises two stages. In the first stage, k-means clustering is employed to identify and eliminate wrongly classified instances. In the second stage, the classification is fine-tuned. To do this, ensemble methods such as AdaBoost, bagging, dagging, stacking, decorate, rotation forest, random subspace, MultiBoost and grading are invoked along with five chosen base classifiers, namely support vector machine (SVM), radial basis function network (RBF), decision tree J48, naïve Bayes and Bayesian network. The k-fold cross-validation technique is adopted. Computational experiments with the proposed method showed an improvement of 16.14% to 22.49% in classification accuracy compared with results reported in the literature. Among the ensemble methods tried, the MultiBoost ensemble with the SVM classifier and the grading ensemble with naïve Bayes showed the best performance, followed by the MultiBoost, stacking and grading ensembles with the Bayesian classifier, the rotation forest ensemble with RBF, and the grading and rotation forest ensembles with J48. This investigation conclusively proves the significance of cascading k-means clustering with ensemble methods for enhancing accuracy in the categorization of diabetic data. © de Gruyter 2012.
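The two-stage cascade described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses synthetic two-dimensional data with randomly flipped labels instead of PIDD, treats an instance as "wrongly classified" when its label disagrees with the majority label of its k-means cluster, and substitutes bagging over a simple nearest-centroid base learner for the ensembles and base classifiers named above. All function names and parameters (`filter_by_cluster`, `fit_bagging`, `noise`, and so on) are hypothetical.

```python
import random
from collections import Counter

def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Component-wise mean of a non-empty list of points."""
    return [sum(col) / len(points) for col in zip(*points)]

def kmeans(X, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns the cluster index of each point."""
    centers = random.Random(seed).sample(X, k)
    assign = [0] * len(X)
    for _ in range(iters):
        assign = [min(range(k), key=lambda c: dist2(x, centers[c])) for x in X]
        for c in range(k):
            members = [x for x, a in zip(X, assign) if a == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = mean(members)
    return assign

def filter_by_cluster(X, y, k=2):
    """Stage 1 (assumed rule): drop instances whose label differs from
    the majority label of the cluster they fall into."""
    assign = kmeans(X, k)
    majority = {}
    for c in set(assign):
        labels = [yi for yi, a in zip(y, assign) if a == c]
        majority[c] = Counter(labels).most_common(1)[0][0]
    keep = [i for i, a in enumerate(assign) if y[i] == majority[a]]
    return [X[i] for i in keep], [y[i] for i in keep]

def fit_centroids(X, y):
    """Base learner (stand-in): one centroid per class."""
    return {c: mean([x for x, yi in zip(X, y) if yi == c]) for c in set(y)}

def predict_centroid(model, x):
    return min(model, key=lambda c: dist2(x, model[c]))

def fit_bagging(X, y, n_models=11, seed=0):
    """Stage 2 (stand-in ensemble): base learners fit on bootstrap samples."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(len(X)) for _ in X]
        models.append(fit_centroids([X[i] for i in idx], [y[i] for i in idx]))
    return models

def predict_bagging(models, x):
    """Majority vote over the base learners."""
    votes = Counter(predict_centroid(m, x) for m in models)
    return votes.most_common(1)[0][0]

def cross_val_accuracy(X, y, folds=5, seed=0):
    """k-fold cross-validated accuracy of the bagged classifier."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    correct = 0
    for f in range(folds):
        test = idx[f::folds]
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        models = fit_bagging([X[i] for i in train], [y[i] for i in train])
        correct += sum(predict_bagging(models, X[i]) == y[i] for i in test)
    return correct / len(X)

def make_data(n=200, noise=0.15, seed=1):
    """Synthetic stand-in data: two Gaussian blobs, some labels flipped."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        center = 0.0 if label == 0 else 4.0
        X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
        y.append(1 - label if rng.random() < noise else label)
    return X, y

X, y = make_data()
X_filtered, y_filtered = filter_by_cluster(X, y)
acc_raw = cross_val_accuracy(X, y)
acc_filtered = cross_val_accuracy(X_filtered, y_filtered)
print(len(X), len(X_filtered), acc_raw, acc_filtered)
```

Note that in this sketch the second accuracy is measured only on the instances retained by the first stage, so the two figures are computed over different sets of instances.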
Karegowda, A. G., Jayaram, M. A., & Manjunath, A. S. (2012). Cascading k-means with Ensemble Learning: Enhanced Categorization of Diabetic Data. Journal of Intelligent Systems, 21(3), 237–253. https://doi.org/10.1515/jisys-2012-0010