A probabilistic approach to feature selection for multi-class text categorization

Ke Wu; Bao Liang Lu; Masao Uchiyama; Hitoshi Isahara

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2007) 4491 LNCS(PART 1) 1310-1317

DOI: 10.1007/978-3-540-72383-7_153

9 Citations

12 Readers

Abstract

In this paper, we propose a probabilistic approach to feature selection for multi-class text categorization. Specifically, we regard the document class and the occurrence of each feature as events, calculate the probability of occurrence of each feature by the theorem of total probability, and use the resulting values as a ranking criterion. Experiments on the Reuters-2000 collection show that the proposed method can yield better performance than information gain and χ-square, which are two well-known feature selection methods. © Springer-Verlag Berlin Heidelberg 2007.
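
The ranking idea in the abstract can be illustrated with a minimal sketch. It is one literal reading of the abstract, not the authors' implementation: the abstract does not say how the class priors P(c) or the conditionals P(t | c) are estimated, so the function name `rank_features`, the set-of-tokens document representation, and the optional `priors` argument are all assumptions made for illustration. The score of a term t is P(t) = Σ_c P(t | c) P(c). Note that with plain empirical estimates this sum reduces to the term's document frequency divided by the number of documents, so the paper's estimator presumably differs in some detail not visible here.

```python
from collections import Counter, defaultdict


def rank_features(docs, labels, priors=None):
    """Rank terms by P(t) = sum over classes c of P(t | c) * P(c).

    docs   : list of iterables of tokens (one per document)
    labels : list of class labels, aligned with docs
    priors : optional dict {class: P(c)}; defaults to empirical class
             frequencies (an assumption, the abstract does not specify this)
    """
    class_counts = Counter(labels)
    n_docs = len(docs)
    if priors is None:
        priors = {c: n / n_docs for c, n in class_counts.items()}

    # doc_freq[c][t] = number of documents of class c that contain term t
    doc_freq = defaultdict(Counter)
    vocab = set()
    for tokens, c in zip(docs, labels):
        unique = set(tokens)
        doc_freq[c].update(unique)
        vocab |= unique

    scores = {}
    for t in vocab:
        # P(t | c) estimated as the fraction of class-c documents containing t
        scores[t] = sum(
            priors[c] * doc_freq[c][t] / class_counts[c] for c in class_counts
        )

    # Highest score first; ties broken alphabetically for a stable order
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

Passing explicit `priors` (for example, uniform priors over classes) changes the ranking on imbalanced collections, which is one way such a criterion could depart from raw document frequency; whether the paper does this is not stated in the abstract.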

Cite

CITATION STYLE

APA

Wu, K., Lu, B. L., Uchiyama, M., & Isahara, H. (2007). A probabilistic approach to feature selection for multi-class text categorization. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4491 LNCS, pp. 1310–1317). Springer Verlag. https://doi.org/10.1007/978-3-540-72383-7_153
