Smoothing is applied in the Bayes classifier when the maximum likelihood (ML) estimate fails because some features are absent from the training data. However, smoothing lacks the firm theoretical basis that the ML estimate has. In this paper, we propose two novel strategies, NB_TF and NB_TS, that remove smoothing from the classifier without sacrificing classification accuracy. NB_TF adjusts the classifier by adding the test document before classification, making it suitable for online categorization. NB_TS improves performance by adding the whole test set to the classifier in the training stage, making it more efficient for batch categorization. Experiments and analysis show that NB_TS outperforms both Laplace additive smoothing and Simple Good-Turing (SGT) smoothing, and that NB_TF performs better than Laplace additive smoothing. © Springer-Verlag Berlin Heidelberg 2005.
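To make the contrast in the abstract concrete, the following is a minimal sketch of a multinomial naive Bayes classifier with standard Laplace (add-one) smoothing, alongside a hypothetical NB_TF-style variant that folds the test document's own word counts into each class before scoring, so every word it contains has a nonzero count and no smoothing term is needed. All function names and the exact merging scheme are illustrative assumptions, not the paper's actual formulation.

```python
from collections import Counter
import math

def train(docs_by_class):
    # Per-class word counts and class priors from labeled training documents.
    counts = {c: Counter(w for d in docs for w in d)
              for c, docs in docs_by_class.items()}
    total = sum(len(docs) for docs in docs_by_class.values())
    priors = {c: len(docs) / total for c, docs in docs_by_class.items()}
    return counts, priors

def classify_laplace(doc, counts, priors, vocab):
    # Standard multinomial NB with Laplace (add-one) smoothing.
    best, best_lp = None, float("-inf")
    for c, cnt in counts.items():
        n = sum(cnt.values())
        lp = math.log(priors[c])
        for w in doc:
            lp += math.log((cnt[w] + 1) / (n + len(vocab)))
        if lp > best_lp:
            best, best_lp = c, lp
    return best

def classify_tf(doc, counts, priors):
    # Hypothetical NB_TF-style sketch (an assumption, not the paper's
    # method): add the test document's counts to each class's counts
    # before scoring, so merged[w] > 0 for every w in doc and the pure
    # ML estimate is well-defined without smoothing.
    doc_cnt = Counter(doc)
    best, best_lp = None, float("-inf")
    for c, cnt in counts.items():
        merged = cnt + doc_cnt      # fold the test doc into this class
        n = sum(merged.values())
        lp = math.log(priors[c])
        for w in doc:
            lp += math.log(merged[w] / n)
        if lp > best_lp:
            best, best_lp = c, lp
    return best
```

With this sketch, a word unseen in training (e.g. "referee" below) would zero out the unsmoothed ML estimate, while the NB_TF-style variant still scores it because the test document itself supplies the count:

```python
train_docs = {"sport": [["ball", "goal", "team"], ["goal", "match"]],
              "tech":  [["code", "bug"], ["code", "compiler"]]}
counts, priors = train(train_docs)
vocab = {w for cnt in counts.values() for w in cnt}
classify_laplace(["code", "bug"], counts, priors, vocab)   # → "tech"
classify_tf(["goal", "team", "referee"], counts, priors)   # → "sport"
```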
CITATION STYLE
Zhu, W. B., Lin, Y. P., Lin, M., & Chen, Z. P. (2005). Removing smoothing from naive Bayes text classifier. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 3739 LNCS, pp. 713–718). https://doi.org/10.1007/11563952_69