Email Spam Filtering

by Enrique Puertas Sanz, José María Gómez Hidalgo, José Carlos Cortizo Pérez
Advances in Computers


In recent years, email spam has become an increasingly serious problem with a substantial economic impact on society. In this work, we present the problem of spam, how it affects us, and how we can fight it. We discuss the legal, economic, and technical measures used to stop these unsolicited emails. Among the technical measures, those based on content analysis have been particularly effective at filtering spam, so we focus on them and explain in detail how they work. Specifically, we explain the structure and operation of the different Machine Learning methods used for this task, and how they can be made cost-sensitive through techniques such as threshold optimization, instance weighting, and MetaCost. We also discuss how to evaluate spam filters using basic metrics, TREC metrics, and the receiver operating characteristic convex hull method, which best suits classification problems in which the target conditions are not known, as is the case here. Finally, we describe how filters are used in practice, the methods spammers use to attack them, and what we can expect in the coming years in the battle between spam filters and spammers.
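
As a rough illustration of the threshold optimization idea mentioned in the abstract, the sketch below picks the spam-score cutoff that minimizes total misclassification cost on a small labelled sample. It is not taken from the chapter: the function names, the toy scores, and the 9:1 cost ratio are all hypothetical, chosen only to show that making false positives (legitimate mail flagged as spam) more expensive pushes the cutoff upward.

```python
from typing import Sequence


def total_cost(scores: Sequence[float], labels: Sequence[bool],
               threshold: float, fp_cost: float, fn_cost: float) -> float:
    """Cost incurred when every message scoring >= threshold is flagged as spam."""
    cost = 0.0
    for score, is_spam in zip(scores, labels):
        flagged = score >= threshold
        if flagged and not is_spam:
            cost += fp_cost      # legitimate message blocked
        elif not flagged and is_spam:
            cost += fn_cost      # spam message let through
    return cost


def best_threshold(scores: Sequence[float], labels: Sequence[bool],
                   fp_cost: float, fn_cost: float) -> float:
    """Return the candidate cutoff with the lowest total cost (lowest cutoff wins ties)."""
    # Candidate cutoffs: each observed score, plus "flag nothing".
    candidates = sorted(set(scores)) + [float("inf")]
    return min(candidates,
               key=lambda t: total_cost(scores, labels, t, fp_cost, fn_cost))


# Hypothetical classifier scores (higher = more spam-like) and true labels.
scores = [0.1, 0.4, 0.6, 0.8, 0.9]
labels = [False, False, True, False, True]

equal_costs = best_threshold(scores, labels, fp_cost=1.0, fn_cost=1.0)
costly_fp = best_threshold(scores, labels, fp_cost=9.0, fn_cost=1.0)
print(equal_costs, costly_fp)
```

With equal costs the chosen cutoff is the lowest one that attains the minimum cost; once a false positive is assumed to cost nine times as much as a false negative, the search settles on a higher, more conservative cutoff. In practice the cutoff would be tuned on held-out data rather than on the sample used to train the classifier.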
