Combination of K-Nearest Neighbor and K-Means based on Term Re-weighting for Classify Indonesian News

Putu WiraBuana; Sesaltina Jannet D.R.M.; I Ketut Gede Darma Putra

Journal Article, Open Access

International Journal of Computer Applications (2012) 50(11) 37-42

DOI: 10.5120/7817-1105

Abstract

K-Nearest Neighbor (KNN) is a widely accepted classification tool, but it uses all training samples during classification, which leads to high computational complexity. To resolve this problem, this paper proposes combining the traditional KNN algorithm with the K-Means clustering algorithm. After preprocessing, the first step is to weight each word (term) using Term Frequency-Inverse Document Frequency (TF-IDF), which weights a word based on the number of times it appears in a document. Second, the training samples of each category are clustered with the K-Means algorithm, and all the cluster centers are taken as the new training samples. Third, the modified training samples are used for classification with the KNN algorithm. Finally, the classification is evaluated using precision, recall, and F-measure. The simulation results show that the proposed combination reaches an accuracy of 87% and an average F-measure of 0.8029, with a best k-value of 5; the computation takes 55 seconds per document.
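The steps in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. The tokenized toy documents, the function names, the textbook TF-IDF form tf * log(N/df) with length normalization, the Euclidean distance, and the deterministic K-Means initialization are all assumptions made for illustration.

```python
import math
from collections import Counter

def fit_idf(docs):
    """docs: list of token lists. Returns (vocab, idf) with idf = log(N / df)."""
    vocab = sorted({t for d in docs for t in d})
    df = Counter(t for d in docs for t in set(d))
    return vocab, {t: math.log(len(docs) / df[t]) for t in vocab}

def vectorize(tokens, vocab, idf):
    """TF-IDF vector scaled to unit length (assumed normalization)."""
    tf = Counter(tokens)
    vec = [tf[t] * idf[t] for t in vocab]
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=20):
    """Plain K-Means; the first k points seed the centers (assumed init)."""
    centers = [list(p) for p in points[:k]]
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda c: sq_dist(p, centers[c]))
            groups[j].append(p)
        # Recompute each center as the mean of its group; keep empty ones as-is.
        centers = [[sum(col) / len(g) for col in zip(*g)] if g else centers[i]
                   for i, g in enumerate(groups)]
    return centers

def reduce_training_set(vectors, labels, clusters_per_class):
    """Cluster each category separately; its centers become the new samples."""
    reduced = []
    for label in sorted(set(labels)):
        members = [v for v, l in zip(vectors, labels) if l == label]
        reduced += [(c, label) for c in kmeans(members, clusters_per_class)]
    return reduced

def knn_predict(x, samples, k):
    """Majority vote among the k nearest (vector, label) samples."""
    nearest = sorted(samples, key=lambda s: sq_dist(x, s[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def precision_recall_f1(true_labels, predicted, positive):
    """Precision, recall and F-measure for one class."""
    pairs = list(zip(true_labels, predicted))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    return precision, recall, (2 * precision * recall / total if total else 0.0)

# Hypothetical, already-preprocessed toy corpus (not data from the paper).
train_docs = [
    ["gol", "bola", "liga"], ["bola", "pemain", "gol"], ["liga", "pemain", "skor"],
    ["saham", "pasar", "rupiah"], ["rupiah", "bank", "pasar"], ["bank", "saham", "inflasi"],
]
train_labels = ["sport"] * 3 + ["economy"] * 3

vocab, idf = fit_idf(train_docs)
train_vectors = [vectorize(d, vocab, idf) for d in train_docs]
reduced = reduce_training_set(train_vectors, train_labels, clusters_per_class=1)
query = vectorize(["pemain", "gol", "liga"], vocab, idf)
prediction = knn_predict(query, reduced, k=1)
print(prediction)
```

With one cluster per category, the six training documents shrink to two cluster centers, so KNN compares the query against two vectors instead of six. That reduction is the source of the computational saving the abstract describes; the values `clusters_per_class=1` and `k=1` are chosen only to keep the toy example small and are not the settings reported in the paper.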

Citation (APA)

WiraBuana, P., Jannet D.R.M., S., & Ketut Gede Darma Putra, I. (2012). Combination of K-Nearest Neighbor and K-Means based on Term Re-weighting for Classify Indonesian News. International Journal of Computer Applications, 50(11), 37–42. https://doi.org/10.5120/7817-1105
