An improved hierarchical Bayesian model of language for document classification

Abstract

This paper addresses the fundamental problem of document classification, focusing on problems where the classes are mutually exclusive. We advocate an approximate sampling distribution for word counts in documents and demonstrate that the model can outperform both the simple multinomial and more recently proposed extensions on the classification task. We also compare the classifiers to a linear SVM and show that, provided certain conditions are met, the new model exceeds the performance of the SVM and ranks among the very best published results on the Newsgroups classification task. © 2008 Licensed under the Creative Commons.

Cite

Allison, B. (2008). An improved hierarchical Bayesian model of language for document classification. In Coling 2008 - 22nd International Conference on Computational Linguistics, Proceedings of the Conference (Vol. 1, pp. 25–32). https://doi.org/10.3115/1599081.1599085
