Maximum entropy modeling with feature selection for text categorization

Jihong Cai; Fei Song

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2008) 4993 LNCS 549-554

DOI: 10.1007/978-3-540-68636-1_62

12 Citations

7 Readers

Abstract

Maximum entropy provides a reasonable way of estimating probability distributions and has been widely used for a number of language processing tasks. In this paper, we explore the use of different feature selection methods for text categorization using maximum entropy modeling. We also propose a new feature selection method based on the difference between the relative document frequencies of a feature in the relevant and irrelevant classes. Our experiments on the Reuters RCV1 data set show that our feature selection method performs better than the other feature selection methods and that maximum entropy modeling is a competitive method for text categorization. © 2008 Springer-Verlag Berlin Heidelberg.
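
The abstract describes the proposed criterion only in words, so the following is a minimal sketch of one plausible reading, not the authors' implementation. It assumes a binary relevant/irrelevant labeling per category, pre-tokenized documents, and a signed difference (relevant minus irrelevant) as the score; the function names `relative_df_difference` and `select_top_k`, the toy documents, and the tie-breaking rule are all hypothetical.

```python
from collections import Counter


def relative_df_difference(docs, labels):
    """Score each term by the difference between its relative document
    frequency in relevant documents and in irrelevant documents.

    docs   -- list of token lists, one per document
    labels -- list of booleans, True for relevant documents
    Assumption: the score is signed (relevant minus irrelevant).
    """
    n_rel = sum(1 for y in labels if y)
    n_irr = len(labels) - n_rel
    if n_rel == 0 or n_irr == 0:
        raise ValueError("need at least one relevant and one irrelevant document")

    df_rel, df_irr = Counter(), Counter()
    for tokens, y in zip(docs, labels):
        # Count each term once per document (document frequency).
        (df_rel if y else df_irr).update(set(tokens))

    vocab = set(df_rel) | set(df_irr)
    return {t: df_rel[t] / n_rel - df_irr[t] / n_irr for t in vocab}


def select_top_k(scores, k):
    """Return the k highest-scoring terms, breaking ties alphabetically."""
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Toy data, invented purely for illustration.
docs = [
    ["oil", "price", "rise"],
    ["oil", "export"],
    ["football", "match"],
    ["price", "match", "win"],
]
labels = [True, True, False, False]

scores = relative_df_difference(docs, labels)
print(select_top_k(scores, 2))
```

In this toy example, "oil" appears in every relevant document and no irrelevant one, so it receives the highest score, while "price" appears equally often in both classes and scores zero. The selected terms would then serve as the feature set for a maximum entropy classifier.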

Cite

Cai, J., & Song, F. (2008). Maximum entropy modeling with feature selection for text categorization. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4993 LNCS, pp. 549–554). https://doi.org/10.1007/978-3-540-68636-1_62
