Towards automatic and optimal filtering levels for feature selection in Text Categorization

Abstract

Text Categorization (TC) is an important task within Information Retrieval (IR). Feature Selection (FS) is crucial in TC because irrelevant features degrade performance. FS is usually performed by selecting the features with the highest scores according to some measure. The disadvantage of this approach is that the number of features to select must be fixed in advance; it is commonly expressed as the percentage of words removed, called the Filtering Level (FL). Consequently, it is usual to run a set of experiments manually with several FLs chosen to represent all possible ones. This process does not guarantee that any of the chosen FLs is optimal, or even close to optimal. This paper addresses this difficulty by proposing a method that automatically determines optimal FLs by solving a univariate maximization problem. © Springer-Verlag Berlin Heidelberg 2005.
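
The idea of treating the choice of FL as a univariate maximization can be illustrated with a minimal sketch. The abstract does not say which optimizer or scoring measure the authors use, so everything below is an assumption made for illustration: golden-section search stands in for the optimizer, classifier performance is assumed to be a unimodal function of the FL, and the names `filter_features`, `golden_section_max` and `toy_performance` are hypothetical.

```python
import math


def filter_features(scores, filtering_level):
    """Return the features kept after removing `filtering_level` percent of them.

    `scores` maps each feature to its score under some FS measure
    (assumed given); the lowest-scoring features are removed first.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    n_keep = max(1, round(len(ranked) * (1 - filtering_level / 100.0)))
    return ranked[:n_keep]


def golden_section_max(f, lo, hi, tol=0.5):
    """Approximate the maximizer of f on [lo, hi] by golden-section search.

    Assumption: f is unimodal on the interval. This optimizer is a stand-in,
    not necessarily the one used in the paper.
    """
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc > fd:
            # The maximum lies in [a, d]; reuse c as the new upper probe.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            # The maximum lies in [c, b]; reuse d as the new lower probe.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return (a + b) / 2


def toy_performance(fl):
    """Hypothetical performance curve peaking at FL = 80 (not real data)."""
    return 1 - ((fl - 80) / 100) ** 2


if __name__ == "__main__":
    best_fl = golden_section_max(toy_performance, 0, 99)
    toy_scores = {"a": 0.9, "b": 0.5, "c": 0.1, "d": 0.05}
    print(best_fl, filter_features(toy_scores, 50))
```

In a real setting `toy_performance` would be replaced by a function that filters the vocabulary at the given FL, trains a classifier and returns its validation performance, so each evaluation is expensive. That cost is the motivation for a search that needs only a few evaluations instead of a manually chosen grid of FLs.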

Cite

Montañés, E., Combarro, E. F., Díaz, I., & Ranilla, J. (2005). Towards automatic and optimal filtering levels for feature selection in Text Categorization. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 3646 LNCS, pp. 239–248). Springer Verlag. https://doi.org/10.1007/11552253_22
