Naive Bayes for text classification with unbalanced classes

Abstract

Multinomial naive Bayes (MNB) is a popular method for document classification due to its computational efficiency and relatively good predictive performance. It has recently been established that predictive performance can be improved further by appropriate data transformations [1,2]. In this paper we present another transformation that is designed to combat a potential problem with the application of MNB to unbalanced datasets. We propose an appropriate correction by adjusting attribute priors. This correction can be implemented as another data normalization step, and we show that it can significantly improve the area under the ROC curve. We also show that the modified version of MNB is very closely related to the simple centroid-based classifier and compare the two methods empirically. © Springer-Verlag Berlin Heidelberg 2006.
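The abstract does not give the formula for the correction. The following is a minimal sketch, assuming it amounts to rescaling each class's word counts to a common total before Laplace smoothing, so that the smoothing constant carries the same weight in a small class as in a large one. The function names `train_mnb` and `score`, the parameter `alpha`, and the toy data are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict


def train_mnb(docs, labels, alpha=None):
    """Train multinomial naive Bayes on tokenized documents.

    If alpha is given, each class's word counts are rescaled to sum to
    alpha before Laplace smoothing (the assumed per-class normalization).
    If alpha is None, plain Laplace-smoothed MNB is trained.
    """
    vocab = sorted({w for doc in docs for w in doc})
    counts = defaultdict(Counter)
    for doc, y in zip(docs, labels):
        counts[y].update(doc)
    n_docs = Counter(labels)

    model = {}
    for y, c in counts.items():
        total = sum(c.values())
        scale = alpha / total if alpha is not None else 1.0
        scaled_total = alpha if alpha is not None else total
        log_prior = math.log(n_docs[y] / len(labels))
        # Laplace smoothing: add 1 to every (possibly rescaled) count.
        log_lik = {
            w: math.log((c[w] * scale + 1.0) / (scaled_total + len(vocab)))
            for w in vocab
        }
        model[y] = (log_prior, log_lik)
    return model


def score(model, doc):
    """Return the unnormalized log posterior of each class for one document."""
    return {
        y: log_prior + sum(log_lik[w] for w in doc if w in log_lik)
        for y, (log_prior, log_lik) in model.items()
    }


# Toy unbalanced data: three documents in one class, one in the other.
docs = [["x", "x"], ["x", "x"], ["x", "x"], ["y"]]
labels = ["big", "big", "big", "small"]

plain = train_mnb(docs, labels)
normalized = train_mnb(docs, labels, alpha=1.0)
print(score(plain, ["y"]))
print(score(normalized, ["y"]))
```

In the unnormalized model, a word that never occurs in the large class receives a much smaller smoothed probability there than an unseen word receives in the small class, simply because the large class has more total counts. Under the assumed rescaling, unseen words get the same smoothed probability in every class.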

Frank, E., & Bouckaert, R. R. (2006). Naive Bayes for text classification with unbalanced classes. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4213 LNAI, pp. 503–510). Springer-Verlag. https://doi.org/10.1007/11871637_49