A comparison of event models for Naive Bayes anti-spam E-mail filtering

Karl Michael Schneider

Conference Proceedings

10th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2003 (2003) 307-314

DOI: 10.3115/1067807.1067848

Abstract

We describe experiments with a Naive Bayes text classifier in the context of anti-spam E-mail filtering, using two different statistical event models: a multi-variate Bernoulli model and a multinomial model. We introduce a family of feature ranking functions for feature selection in the multinomial event model that take account of the word frequency information. We present evaluation results on two publicly available corpora of legitimate and spam E-mails. We find that the multinomial model is less biased towards one class and achieves slightly higher accuracy than the multi-variate Bernoulli model.
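The difference between the two event models named in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the toy training data, the function names, and the use of Laplace (add-one) smoothing are all assumptions made for illustration. The multi-variate Bernoulli model treats each vocabulary word as a present/absent event, while the multinomial model treats each word occurrence as an event, so only the latter is sensitive to word frequency.

```python
import math
from collections import Counter

# Toy training data (hypothetical, not taken from the paper's corpora).
TRAIN = [
    ("spam", "buy cheap pills buy now".split()),
    ("spam", "cheap pills online".split()),
    ("ham", "meeting agenda for monday".split()),
    ("ham", "project meeting notes".split()),
]
VOCAB = sorted({w for _, doc in TRAIN for w in doc})
CLASSES = sorted({c for c, _ in TRAIN})


def log_prior(c):
    """Log of the fraction of training documents labelled c."""
    n_c = sum(1 for label, _ in TRAIN if label == c)
    return math.log(n_c / len(TRAIN))


def bernoulli_log_score(doc, c):
    """Multi-variate Bernoulli: every vocabulary word is a binary event,
    so absent words contribute to the score as well as present ones."""
    docs_c = [set(d) for label, d in TRAIN if label == c]
    present = set(doc)
    score = log_prior(c)
    for w in VOCAB:
        # Laplace-smoothed fraction of class-c documents containing w
        # (the smoothing choice is an assumption of this sketch).
        p = (1 + sum(w in d for d in docs_c)) / (2 + len(docs_c))
        score += math.log(p if w in present else 1 - p)
    return score


def multinomial_log_score(doc, c):
    """Multinomial: every word occurrence is an event, so a repeated
    word counts each time; absent words contribute nothing."""
    counts = Counter(w for label, d in TRAIN if label == c for w in d)
    total = sum(counts.values())
    score = log_prior(c)
    for w in doc:
        if w in VOCAB:  # out-of-vocabulary words are ignored
            score += math.log((1 + counts[w]) / (len(VOCAB) + total))
    return score


def classify(doc, score_fn):
    """Return the class with the highest log score under score_fn."""
    return max(CLASSES, key=lambda c: score_fn(doc, c))
```

Calling `classify("cheap pills".split(), multinomial_log_score)` or the same with `bernoulli_log_score` scores a message under either model. Repeating a word in the message changes the multinomial score but leaves the Bernoulli score unchanged, which is the frequency information the abstract refers to.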

Citation (APA)

Schneider, K. M. (2003). A comparison of event models for naive Bayes anti-spam E-mail filtering. In 10th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2003 (pp. 307–314). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1067807.1067848
