Abstract
We introduce a new feature selection method for text categorization. Our MMR-based method reduces redundancy between features while maintaining information gain when selecting features for text categorization. Empirical results show that MMR-based feature selection is more effective than both Koller & Sahami’s method, a greedy feature selection method, and conventional information gain, which is commonly used for feature selection in text categorization. Moreover, with MMR-based feature selection, conventional machine learning algorithms sometimes outperform SVM, which is known to give the best classification accuracy.
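The trade-off described above, keeping informative features while penalizing redundant ones, can be illustrated with a generic greedy selection in the style of maximal marginal relevance (MMR). This is a minimal sketch and not the authors' implementation: the abstract does not give the scoring formula, so the linear combination, the weight `lam`, the function name `mmr_select`, and the toy numbers are all assumptions. The `relevance` scores stand in for per-feature information gain, and `redundancy` stands in for some pairwise redundancy measure between features.

```python
from typing import List, Sequence


def mmr_select(
    relevance: Sequence[float],
    redundancy: Sequence[Sequence[float]],
    k: int,
    lam: float = 0.5,
) -> List[int]:
    """Greedily pick up to k feature indices with an MMR-style score.

    Assumed score (not taken from the paper):
        lam * relevance[i] - (1 - lam) * max(redundancy[i][j] for j in selected)
    """
    selected: List[int] = []
    candidates = set(range(len(relevance)))

    while candidates and len(selected) < k:

        def score(i: int) -> float:
            # Penalty is the highest redundancy with any feature already chosen;
            # it is 0.0 while nothing has been selected yet.
            penalty = max((redundancy[i][j] for j in selected), default=0.0)
            return lam * relevance[i] - (1.0 - lam) * penalty

        # Iterate in index order so ties resolve to the lowest index.
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)

    return selected


# Toy data (hypothetical): features 0 and 1 are both informative but
# nearly redundant with each other; feature 2 is weaker but distinct.
relevance = [0.9, 0.85, 0.3]
redundancy = [
    [0.0, 0.9, 0.1],
    [0.9, 0.0, 0.1],
    [0.1, 0.1, 0.0],
]

balanced = mmr_select(relevance, redundancy, k=2, lam=0.5)
relevance_only = mmr_select(relevance, redundancy, k=2, lam=1.0)
```

With `lam=1.0` the redundancy term vanishes and the procedure reduces to plain ranking by relevance, which corresponds to the conventional information gain baseline mentioned in the abstract. Lower values of `lam` favor features that differ from those already selected.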
Cite
Lee, C., & Lee, G. G. (2004). MMR-based feature selection for text categorization. In HLT-NAACL 2004 - Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics, Short Papers (pp. 5–8). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1613984.1613986