An examination of feature selection frameworks in text categorization

Chih How Bong; Wong Ting Kiong

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2005) 3689 LNCS 558-564

DOI: 10.1007/11562382_50

12 Citations

3 Readers

Abstract

Feature selection, an important task in text categorization, is used for dimensionality reduction. It can be performed either locally or globally. In local selection, a distinct feature set is derived for each class, so the number of feature sets depends on the number of classes. In contrast, global feature selection uses a single universal feature set, which is assumed to preserve the characteristics of all classes. Furthermore, feature selection can be based on the relevant feature set only (local dictionary) or on both the relevant and irrelevant feature sets (universal dictionary). In this paper, we explore these frameworks of feature selection for text categorization on the Reuters(10) and Reuters(115) datasets (variants of the Reuters-21578 corpus). We then investigate the efficiency of 7 different local or global feature selection methods in combination with local and universal dictionaries. Our experiments show that local feature selection with a local dictionary yields the best categorization results. © Springer-Verlag Berlin Heidelberg 2005.
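
The distinction the abstract draws between local and global selection, and between local and universal dictionaries, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy corpus, the function names, the choice of the chi-square statistic as the scoring metric (the abstract does not name its 7 methods), and the use of the maximum score over classes for global selection. "Local dictionary" is read here as drawing candidate terms only from documents of the class in question.

```python
# Toy corpus: (tokens, class label). Entirely made up for illustration.
DOCS = [
    (["wheat", "crop", "export"], "grain"),
    (["wheat", "harvest", "price"], "grain"),
    (["oil", "barrel", "price"], "crude"),
    (["oil", "export", "opec"], "crude"),
]

def chi_square(term, cls, docs):
    """Chi-square statistic of a term for one class (2x2 contingency table)."""
    a = sum(1 for toks, c in docs if term in toks and c == cls)      # term present, in class
    b = sum(1 for toks, c in docs if term in toks and c != cls)      # term present, other class
    c_ = sum(1 for toks, c in docs if term not in toks and c == cls)  # term absent, in class
    d = sum(1 for toks, c in docs if term not in toks and c != cls)  # term absent, other class
    n = a + b + c_ + d
    denom = (a + c_) * (b + d) * (a + b) * (c_ + d)
    return n * (a * d - b * c_) ** 2 / denom if denom else 0.0

def vocabulary(docs, cls=None):
    """Terms from one class only (local dictionary) or from all documents (universal)."""
    return sorted({t for toks, c in docs if cls is None or c == cls for t in toks})

def local_selection(docs, k, local_dictionary=True):
    """Local selection: one feature set of size k per class."""
    classes = sorted({c for _, c in docs})
    result = {}
    for cls in classes:
        vocab = vocabulary(docs, cls if local_dictionary else None)
        # Rank by descending score; break ties alphabetically for determinism.
        ranked = sorted(vocab, key=lambda t: (-chi_square(t, cls, docs), t))
        result[cls] = ranked[:k]
    return result

def global_selection(docs, k):
    """Global selection: a single feature set shared by all classes."""
    classes = sorted({c for _, c in docs})
    vocab = vocabulary(docs)
    # Assumption: a term's global score is its best score over all classes.
    score = {t: max(chi_square(t, c, docs) for c in classes) for t in vocab}
    return sorted(vocab, key=lambda t: (-score[t], t))[:k]

if __name__ == "__main__":
    print(local_selection(DOCS, 2))
    print(local_selection(DOCS, 2, local_dictionary=False))
    print(global_selection(DOCS, 2))
```

On this toy corpus, local selection with a local dictionary keeps only terms that occur in each class's own documents, whereas the universal dictionary also admits terms such as "oil" for the "grain" class, because their absence is just as informative under chi-square.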

Cite

Bong, C. H., & Kiong, W. T. (2005). An examination of feature selection frameworks in text categorization. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 3689 LNCS, pp. 558–564). https://doi.org/10.1007/11562382_50
