Abstract
We describe experiments with a Naive Bayes text classifier in the context of anti-spam E-mail filtering, using two different statistical event models: a multi-variate Bernoulli model and a multinomial model. We introduce a family of feature ranking functions for feature selection in the multinomial event model that take account of the word frequency information. We present evaluation results on two publicly available corpora of legitimate and spam E-mails. We find that the multinomial model is less biased towards one class and achieves slightly higher accuracy than the multi-variate Bernoulli model.
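To make the distinction between the two event models concrete, the following is a minimal sketch and not the paper's implementation. It assumes add-one (Laplace) smoothing, a tiny made-up corpus, and hypothetical function names (`train`, `bernoulli_log_scores`, `multinomial_log_scores`). The multi-variate Bernoulli model scores the presence or absence of every vocabulary word, while the multinomial model scores each word occurrence, so only the latter is sensitive to how often a word appears. The multinomial score omits the document-length and multinomial-coefficient terms, which are the same for every class and do not affect the decision. Feature selection is not shown.

```python
import math
from collections import Counter

def train(docs, labels, vocab):
    """Collect the per-class counts needed by both event models."""
    classes = sorted(set(labels))
    n_docs = Counter(labels)
    doc_freq = {c: Counter() for c in classes}   # number of documents containing a word
    term_freq = {c: Counter() for c in classes}  # total occurrences of a word
    for tokens, c in zip(docs, labels):
        kept = [t for t in tokens if t in vocab]
        term_freq[c].update(kept)
        doc_freq[c].update(set(kept))
    return classes, n_docs, doc_freq, term_freq

def bernoulli_log_scores(tokens, model, vocab):
    """Multi-variate Bernoulli: one present/absent event per vocabulary word."""
    classes, n_docs, doc_freq, _ = model
    present = set(tokens) & vocab
    total = sum(n_docs.values())
    scores = {}
    for c in classes:
        s = math.log(n_docs[c] / total)
        for w in vocab:
            p = (doc_freq[c][w] + 1) / (n_docs[c] + 2)  # assumed Laplace smoothing
            s += math.log(p if w in present else 1.0 - p)
        scores[c] = s
    return scores

def multinomial_log_scores(tokens, model, vocab):
    """Multinomial: one event per word occurrence, so frequency matters."""
    classes, n_docs, _, term_freq = model
    total = sum(n_docs.values())
    scores = {}
    for c in classes:
        denom = sum(term_freq[c].values()) + len(vocab)  # assumed Laplace smoothing
        s = math.log(n_docs[c] / total)
        for w in tokens:
            if w in vocab:
                s += math.log((term_freq[c][w] + 1) / denom)
        scores[c] = s
    return scores

# Toy corpus, invented for illustration only.
docs = [
    ["cheap", "pills", "cheap", "offer"],
    ["meeting", "agenda", "notes"],
    ["offer", "cheap", "cheap", "cheap"],
    ["project", "meeting", "notes", "notes"],
]
labels = ["spam", "ham", "spam", "ham"]
vocab = {w for d in docs for w in d}
model = train(docs, labels, vocab)

def predict(score_fn, tokens):
    scores = score_fn(tokens, model, vocab)
    return max(scores, key=scores.get)

print(predict(bernoulli_log_scores, ["cheap", "cheap", "offer"]))
print(predict(multinomial_log_scores, ["meeting", "notes"]))
```

In this sketch, repeating a word in a message leaves the Bernoulli scores unchanged but shifts the multinomial scores, which is the word frequency information the abstract refers to.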
Cite
Schneider, K. M. (2003). A comparison of event models for naive bayes anti-spam E-mail filtering. In 10th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2003 (pp. 307–314). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1067807.1067848