Text categorization using weight adjusted k-nearest neighbor classification

Eui Hong Sam Han; George Karypis; Vipin Kumar

Conference Proceedings

Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science) (2001) 2035 53-65

DOI: 10.1007/3-540-45357-1_9

209 Citations

214 Readers

Abstract

Text categorization presents unique challenges due to the large number of attributes in the data set, the large number of training samples, attribute dependency, and the multi-modality of categories. Existing classification techniques have limited applicability to data sets of this nature. In this paper, we present Weight Adjusted k-Nearest Neighbor (WAKNN) classification, which learns feature weights using a greedy hill-climbing technique. We also present two performance optimizations of WAKNN that improve computational performance by a few orders of magnitude without compromising classification quality. We experimentally evaluated WAKNN on 52 document data sets from a variety of domains and compared its performance against several classification algorithms, such as C4.5, RIPPER, Naive-Bayesian, PEBLS and VSM. Experimental results on these data sets confirm that WAKNN consistently outperforms other existing classification algorithms.
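
The abstract names the core idea (a k-nearest-neighbor classifier whose feature weights are learned by greedy hill climbing) but gives no details. The following is a minimal illustrative sketch of that general idea, not the authors' implementation. The weighted cosine similarity, the leave-one-out accuracy objective, the multiplicative candidate factors, and all function names (`weighted_cosine`, `knn_predict`, `loo_accuracy`, `learn_weights`) are assumptions made for illustration, and the paper's two performance optimizations are not modeled.

```python
import math
from collections import defaultdict

def weighted_cosine(x, y, w):
    # Cosine similarity after scaling each feature by its weight (assumed form).
    num = sum((wi * xi) * (wi * yi) for xi, yi, wi in zip(x, y, w))
    nx = math.sqrt(sum((wi * xi) ** 2 for xi, wi in zip(x, w)))
    ny = math.sqrt(sum((wi * yi) ** 2 for yi, wi in zip(y, w)))
    return num / (nx * ny) if nx > 0 and ny > 0 else 0.0

def knn_predict(query, docs, labels, w, k, skip=None):
    # Rank training documents by weighted similarity; `skip` excludes one index
    # so the same routine can be used for leave-one-out evaluation.
    sims = [(weighted_cosine(query, d, w), labels[i])
            for i, d in enumerate(docs) if i != skip]
    sims.sort(key=lambda t: t[0], reverse=True)
    votes = defaultdict(float)
    for s, lab in sims[:k]:
        votes[lab] += s  # similarity-weighted vote per class
    # Sorting the labels first makes tie-breaking deterministic.
    return max(sorted(votes), key=lambda lab: votes[lab])

def loo_accuracy(docs, labels, w, k):
    # Hypothetical objective: leave-one-out accuracy on the training set.
    hits = sum(knn_predict(d, docs, labels, w, k, skip=i) == labels[i]
               for i, d in enumerate(docs))
    return hits / len(docs)

def learn_weights(docs, labels, k=3, factors=(0.5, 2.0), max_rounds=10):
    # Greedy hill climbing: perturb one weight at a time and keep the change
    # only if the objective strictly improves; stop when a full pass fails
    # to improve or after max_rounds passes.
    w = [1.0] * len(docs[0])
    best = loo_accuracy(docs, labels, w, k)
    for _ in range(max_rounds):
        improved = False
        for j in range(len(w)):
            for f in factors:
                trial = w[:]
                trial[j] *= f
                score = loo_accuracy(docs, labels, trial, k)
                if score > best:
                    w, best, improved = trial, score, True
        if not improved:
            break
    return w, best
```

Calling `learn_weights(docs, labels)` on a list of equal-length term-frequency vectors and their class labels returns the learned weight vector together with the objective value it reached. Because a change is accepted only on strict improvement, the returned score is never lower than that of uniform weights. Each candidate step re-evaluates the whole training set, which is the kind of cost the paper's optimizations are said to reduce.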

Cite

Han, E. H. S., Karypis, G., & Kumar, V. (2001). Text categorization using weight adjusted k-nearest neighbor classification. In Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science) (Vol. 2035, pp. 53–65). Springer Verlag. https://doi.org/10.1007/3-540-45357-1_9
