Gamma-Poisson Distribution Model for Text Categorization

  • Ogura H
  • Amano H
  • Kondo M


Abstract

We introduce a new model of word frequency distributions in documents for automatic text classification. The model uses the gamma-Poisson probability distribution to model text more accurately. We present the modeling framework and its application to text categorization, together with practical techniques for parameter estimation and vector normalization. To evaluate the model, we performed text categorization experiments on the 20 Newsgroups, Reuters-21578, Industry Sector, and TechTC-100 datasets. The results show that the model achieves performance comparable to that of the support vector machine and clearly exceeds that of the multinomial and Dirichlet-multinomial models. We also discuss the time complexity of the proposed classifier and its advantage in practical applications.
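As a rough illustration of the idea in the abstract, the sketch below scores a document's word counts under a gamma-Poisson (negative binomial) distribution per word and picks the class with the highest log-likelihood. It is not the authors' implementation: the textbook parameterization (shape `alpha`, scale `beta`, with the Poisson rate multiplied by document length), the assumption that words are independent given the class, and every function and variable name here are assumptions made for illustration. The paper's parameter estimation and vector normalization techniques are not reproduced.

```python
import math


def gamma_poisson_logpmf(x, alpha, beta, length=1.0):
    """Log-probability of count x when x ~ Poisson(lam * length)
    and lam ~ Gamma(shape=alpha, scale=beta).

    Integrating out lam gives a negative binomial distribution with
    mean alpha * beta * length.
    """
    scale = beta * length
    return (
        math.lgamma(x + alpha)
        - math.lgamma(alpha)
        - math.lgamma(x + 1)
        + x * math.log(scale)
        - (x + alpha) * math.log1p(scale)
    )


def class_log_score(counts, params, length, log_prior=0.0):
    """Sum per-word log-probabilities (assumes words are independent given the class)."""
    return log_prior + sum(
        gamma_poisson_logpmf(x, alpha, beta, length)
        for x, (alpha, beta) in zip(counts, params)
    )


def classify(counts, class_params, length=1.0):
    """Return the class label whose parameters give the highest log score."""
    return max(
        class_params,
        key=lambda label: class_log_score(counts, class_params[label], length),
    )


# Hypothetical two-word vocabulary with hand-picked (alpha, beta) pairs per class.
toy_params = {
    "a": [(2.0, 1.0), (0.5, 0.1)],  # word 0 is frequent in class "a"
    "b": [(0.5, 0.1), (2.0, 1.0)],  # word 1 is frequent in class "b"
}
predicted = classify([5, 0], toy_params)
```

In this toy setup a document with five occurrences of word 0 and none of word 1 is assigned to class "a", since that class expects word 0 to be frequent and word 1 to be rare.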

Ogura, H., Amano, H., & Kondo, M. (2013). Gamma-Poisson Distribution Model for Text Categorization. ISRN Artificial Intelligence, 2013, 1–17. https://doi.org/10.1155/2013/829630
