Text categorization by a machine-learning-based term selection

Abstract

Term selection is one of the main tasks in Information Retrieval and Text Categorization. It has traditionally been carried out with statistical methods based on how frequently words appear in the documents. This paper presents a method for extracting the relevant words of a document by taking their linguistic information into account. These relevant words are obtained by a Machine Learning algorithm that takes manually selected words as its training set. With the lexica obtained by this technique, Text Categorization is performed using Support Vector Machines. The results are compared with one of the most widely used methods for term selection (based only on statistical information), and the new method is found to perform better, with the additional advantage of selecting the filtering level automatically. © Springer-Verlag Berlin Heidelberg 2004.
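To make the statistical baseline concrete, here is a minimal sketch of frequency-based term selection with a manually chosen filtering level. The abstract does not name the specific statistical measure it compares against, so document frequency is assumed here purely for illustration; the names `select_terms`, `to_vector`, and `filtering_level`, as well as the toy documents, are hypothetical. The paper's own method (a learner trained on manually selected words and linguistic information) is not reproduced.

```python
from collections import Counter


def select_terms(documents, filtering_level):
    """Return a lexicon after discarding a fraction of the vocabulary.

    filtering_level is the fraction (0..1) of terms to drop; it must be
    chosen by hand here, which is the step the paper reports automating.
    Document frequency is an assumed scoring measure, not the paper's.
    """
    # Document frequency: in how many documents each term appears.
    df = Counter()
    for doc in documents:
        df.update(set(doc.lower().split()))
    # Highest document frequency first; ties broken alphabetically.
    ranked = sorted(df, key=lambda term: (-df[term], term))
    keep = max(1, round(len(ranked) * (1.0 - filtering_level)))
    return ranked[:keep]


def to_vector(document, lexicon):
    """Represent a document as term counts over the selected lexicon."""
    counts = Counter(document.lower().split())
    return [counts[term] for term in lexicon]


# Toy corpus (hypothetical data, for illustration only).
docs = [
    "wheat prices rise as wheat exports grow",
    "oil prices fall after oil output rises",
    "wheat harvest and grain exports",
]
lexicon = select_terms(docs, filtering_level=0.5)
vectors = [to_vector(d, lexicon) for d in docs]
```

In a pipeline like the one the abstract describes, vectors of this kind would then be passed to a Support Vector Machine classifier; that step is omitted to keep the sketch dependency-free.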

Cite

Fernández, J., Montañés, E., Díaz, I., Ranilla, J., & Combarro, E. F. (2004). Text categorization by a machine-learning-based term selection. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 3180, 253–262. https://doi.org/10.1007/978-3-540-30075-5_25
