Document clustering, or unsupervised document classification, has been used to enhance information retrieval, and it has recently become an intense area of research because of its practical importance. Outliers are the elements whose similarity to the centroid of their category falls below some threshold value. In this paper, we show that excluding outliers from noisy training data significantly improves the performance of the centroid-based classifier, which is the best known method. The proposed method performs about 10% better than the centroid-based classifier. © Springer-Verlag Berlin Heidelberg 2006.
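The outlier definition above can be illustrated with a minimal sketch. The abstract does not specify the similarity measure, the threshold value, or the document representation, so the cosine similarity, the threshold of 0.6 used in the usage note, and all function names below are assumptions made for illustration only, not the authors' implementation.

```python
import math


def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cosine(a, b):
    """Cosine similarity; an assumed choice, the abstract names no measure."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))


def centroid(vectors):
    """Component-wise mean of the unit-normalized vectors of one category."""
    unit = [normalize(v) for v in vectors]
    return [sum(col) / len(unit) for col in zip(*unit)]


def filter_outliers(docs_by_class, threshold):
    """Drop training documents whose similarity to their own category
    centroid is below `threshold` (the outlier definition in the abstract)."""
    kept = {}
    for label, vectors in docs_by_class.items():
        c = centroid(vectors)
        inliers = [v for v in vectors if cosine(v, c) >= threshold]
        # Assumption: never leave a category without training documents.
        kept[label] = inliers or list(vectors)
    return kept


def train(docs_by_class, threshold):
    """Recompute one centroid per category from the filtered training set."""
    filtered = filter_outliers(docs_by_class, threshold)
    return {label: centroid(vs) for label, vs in filtered.items()}


def classify(centroids, doc):
    """Assign the category whose centroid is most similar to the document."""
    return max(centroids, key=lambda label: cosine(doc, centroids[label]))
```

As a usage example under the same assumptions, with toy term-weight vectors `{"sports": [[1, 0, 0], [0.9, 0.1, 0], [0, 0, 1]], "politics": [[0, 1, 0], [0.1, 0.9, 0]]}`, calling `train(data, 0.6)` discards the third "sports" vector, which points away from the rest of its category, before the centroids are recomputed.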
CITATION STYLE
Shin, K., Abraham, A., & Han, S. (2006). Enhanced centroid-based classification technique by filtering outliers. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4188 LNCS, pp. 159–163). Springer Verlag. https://doi.org/10.1007/11846406_20