Reduction of training noises for text classifiers


Abstract

Automatic text classification (TC) is essential for archiving and retrieving texts, which are a main way of recording information and expertise. Previous studies have therefore developed many text classifiers. These studies typically built the classifiers from training texts and showed that the classifiers performed well in various application domains. However, because training texts are often unsound or incomplete in practice, they often contain many terms unrelated to the categories of interest. Such terms act as training noises and can degrade classifier performance. Reducing the training noises is thus essential, and it is also challenging precisely because the training texts are unsound or incomplete. In this paper, we develop a technique, TNR (Training Noise Reduction), that removes possible training noises so that classifier performance can be further improved. Given a training text d of a category c, TNR identifies a sequence of consecutive terms in d as noise if the terms are not strongly related to c. A case study on classifying Chinese texts of disease information shows that TNR can improve a Support Vector Machine (SVM) classifier, a state-of-the-art classifier in TC. The contribution is significant for the further enhancement of existing text classifiers. © 2013 Springer-Verlag.
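
The abstract does not say how TNR measures term-category relatedness or how long a run of terms must be before it counts as noise. The sketch below only illustrates the general idea of dropping runs of consecutive weakly related terms. It is not the paper's method: the strength measure (the fraction of documents containing a term that belong to the category), the `threshold` and `min_run` parameters, and both function names are assumptions made for this example.

```python
from collections import Counter


def term_category_strength(docs):
    """Estimate how strongly each term is tied to each category.

    docs is a list of (tokens, category) pairs. The score for (term, category)
    is the fraction of documents containing the term that belong to the
    category. This measure is an assumption, not taken from the paper.
    """
    doc_freq = Counter()
    doc_freq_in_cat = Counter()
    for tokens, category in docs:
        for term in set(tokens):
            doc_freq[term] += 1
            doc_freq_in_cat[(term, category)] += 1
    return {
        (term, category): count / doc_freq[term]
        for (term, category), count in doc_freq_in_cat.items()
    }


def remove_noise_runs(tokens, category, strength, threshold=0.5, min_run=2):
    """Drop runs of at least min_run consecutive weakly related terms.

    A term is weak when its score for the category is below threshold.
    Shorter runs of weak terms are kept. Both parameters are hypothetical.
    """
    kept = []
    run = []  # current run of consecutive weak terms
    for term in tokens:
        if strength.get((term, category), 0.0) < threshold:
            run.append(term)
            continue
        if len(run) < min_run:
            kept.extend(run)  # run too short to be treated as noise
        run = []
        kept.append(term)
    if len(run) < min_run:
        kept.extend(run)
    return kept


# Toy training set: the trailing site text appears in both categories.
docs = [
    (["fever", "cough", "visit", "our", "website"], "flu"),
    (["rash", "itch", "visit", "our", "website"], "eczema"),
]
strength = term_category_strength(docs)
cleaned = remove_noise_runs(docs[0][0], "flu", strength, threshold=0.6)
print(cleaned)  # ['fever', 'cough']
```

In this toy example the terms shared by both categories score 0.5 for each, so the three-term run at the end of the first text falls below the 0.6 threshold and is removed, while the category-specific terms are kept. The cleaned texts would then be passed to the classifier trainer in place of the originals.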

Cite

Liu, R. L. (2013). Reduction of training noises for text classifiers. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7803 LNAI, pp. 30–39). https://doi.org/10.1007/978-3-642-36543-0_4
