A Comparison of Event Models for Naive Bayes Anti-Spam E-Mail Filtering

Abstract

We describe experiments with a Naive Bayes text classifier in the context of anti-spam E-mail filtering, using two different statistical event models: a multi-variate Bernoulli model and a multinomial model. We introduce a family of feature ranking functions for feature selection in the multinomial event model that take word frequency information into account. We present evaluation results on two publicly available corpora of legitimate and spam E-mails. We find that the multinomial model is less biased towards one class and achieves slightly higher accuracy than the multi-variate Bernoulli model.
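
To make the difference between the two event models concrete, the following is a minimal sketch rather than the paper's implementation. The function names, the toy messages, and the use of Laplace (add-one) smoothing are all assumptions made for illustration. The multi-variate Bernoulli model scores every vocabulary word as present or absent in a message, while the multinomial model scores each word occurrence and therefore uses word frequencies.

```python
import math
from collections import Counter


def train(docs, labels, model):
    """Estimate class priors and word probabilities.

    docs is a list of token lists; model is "bernoulli" or "multinomial".
    Laplace smoothing is an assumption of this sketch.
    """
    vocab = sorted({w for d in docs for w in d})
    prior, cond = {}, {}
    for c in sorted(set(labels)):
        class_docs = [d for d, y in zip(docs, labels) if y == c]
        prior[c] = len(class_docs) / len(docs)
        if model == "bernoulli":
            # P(w present | c): share of class documents that contain w.
            df = Counter(w for d in class_docs for w in set(d))
            cond[c] = {w: (1 + df[w]) / (2 + len(class_docs)) for w in vocab}
        else:
            # P(w | c): share of all word occurrences in the class that are w.
            tf = Counter(w for d in class_docs for w in d)
            total = sum(tf.values())
            cond[c] = {w: (1 + tf[w]) / (len(vocab) + total) for w in vocab}
    return vocab, prior, cond


def log_posterior(doc, c, params, model):
    """Unnormalized log P(c | doc) under the chosen event model."""
    vocab, prior, cond = params
    score = math.log(prior[c])
    if model == "bernoulli":
        present = set(doc)
        # Every vocabulary word contributes, whether present or absent.
        for w in vocab:
            p = cond[c][w]
            score += math.log(p if w in present else 1.0 - p)
    else:
        # Only words in the message contribute, weighted by their counts.
        for w, n in Counter(doc).items():
            if w in cond[c]:  # words outside the vocabulary are ignored
                score += n * math.log(cond[c][w])
    return score


def classify(doc, params, model):
    """Return the class with the highest log posterior."""
    return max(params[1], key=lambda c: log_posterior(doc, c, params, model))


# Hypothetical toy corpus, not taken from the paper's evaluation data.
TOY_DOCS = [
    ["cheap", "pills", "cheap"],
    ["win", "money", "now"],
    ["meeting", "at", "noon"],
    ["project", "meeting", "notes"],
]
TOY_LABELS = ["spam", "spam", "ham", "ham"]

if __name__ == "__main__":
    for model in ("bernoulli", "multinomial"):
        params = train(TOY_DOCS, TOY_LABELS, model)
        print(model, classify(["cheap", "money"], params, model))
```

Note that the repeated word "cheap" in the first toy message changes the multinomial estimates but not the Bernoulli ones, which is the kind of frequency information the abstract refers to.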

Cite

Schneider, K. M. (2003). A comparison of event models for Naive Bayes anti-spam E-mail filtering. In 10th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2003 (pp. 307–314). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1067807.1067848
