Combination of K-Nearest Neighbor and K-Means based on Term Re-weighting for Classify Indonesian News

  • WiraBuana P
  • Jannet D.R.M. S
  • Ketut Gede Darma Putra I

Abstract

K-Nearest Neighbor (KNN) is a widely accepted classification tool, but it uses all training samples during classification, which leads to high computational complexity. To resolve this problem, this paper proposes combining the traditional KNN algorithm with the K-Means clustering algorithm. After the preprocessing step, the first task is to weight each word (term) using Term Frequency-Inverse Document Frequency (TF-IDF). TF-IDF weights a word according to the number of times it appears in a document. Second, the training samples of each category are clustered with the K-Means algorithm, and all the cluster centers are taken as the new training samples. Third, the modified training samples are used for classification with the KNN algorithm. Finally, the results are evaluated using precision, recall and f-measure. The simulation results show that the proposed combination reaches an accuracy of 87% and an average f-measure of 0.8029, with a best k value of 5, and the computation takes 55 seconds per document.
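The three steps described in the abstract can be sketched in plain Python as follows. This is a minimal illustration, not the authors' implementation: the smoothed IDF formula, the deterministic K-Means initialization, the use of cosine similarity in the KNN step, and every function name and toy document are assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def vectorize(tokens, vocab, idf):
    """TF-IDF vector for one tokenized document; unknown terms are ignored."""
    tf = Counter(tokens)
    return [tf[t] * idf[t] for t in vocab]


def tfidf_vectors(docs):
    """Step 1: weight every term with TF-IDF (assumed smoothed IDF variant)."""
    vocab = sorted({t for d in docs for t in d})
    df = Counter(t for d in docs for t in set(d))
    idf = {t: math.log(len(docs) / df[t]) + 1.0 for t in vocab}
    return [vectorize(d, vocab, idf) for d in docs], vocab, idf


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, n_clusters, n_iter=20):
    """Plain K-Means; the first n_clusters points serve as initial centers."""
    centers = [list(p) for p in points[:n_clusters]]
    for _ in range(n_iter):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
            groups[nearest].append(p)
        # An empty cluster keeps its previous center.
        centers = [
            [sum(col) / len(g) for col in zip(*g)] if g else c
            for g, c in zip(groups, centers)
        ]
    return centers


def reduce_training_set(vectors, labels, clusters_per_class):
    """Step 2: cluster each category and keep only the labeled cluster centers."""
    by_label = defaultdict(list)
    for v, y in zip(vectors, labels):
        by_label[y].append(v)
    return [(c, y) for y, pts in by_label.items() for c in kmeans(pts, clusters_per_class)]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def knn_predict(query, reduced, k):
    """Step 3: majority vote among the k most similar cluster centers."""
    ranked = sorted(reduced, key=lambda item: cosine(query, item[0]), reverse=True)
    return Counter(label for _, label in ranked[:k]).most_common(1)[0][0]


# Made-up toy corpus of already preprocessed (tokenized) documents.
docs = [
    ["bola", "gol", "tim"], ["tim", "gol", "liga"], ["bola", "liga", "pelatih"],
    ["saham", "pasar", "bank"], ["bank", "rupiah", "pasar"], ["saham", "rupiah", "investor"],
]
labels = ["sport"] * 3 + ["economy"] * 3

vectors, vocab, idf = tfidf_vectors(docs)
reduced = reduce_training_set(vectors, labels, clusters_per_class=1)
prediction = knn_predict(vectorize(["gol", "liga", "tim"], vocab, idf), reduced, k=1)
```

The point of the sketch is the cost reduction: KNN compares a query against `len(reduced)` cluster centers rather than against every training document, which is the saving the abstract attributes to the combined approach.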

Cite

WiraBuana, P., Jannet D.R.M., S., & Ketut Gede Darma Putra, I. (2012). Combination of K-Nearest Neighbor and K-Means based on Term Re-weighting for Classify Indonesian News. International Journal of Computer Applications, 50(11), 37–42. https://doi.org/10.5120/7817-1105
