SVM-Based Feature Selection and Classification for Email Filtering

Sebastián Maldonado; Gaston L’Huillier

Book Chapter

DOI: 10.1007/978-3-642-36530-0_11

Abstract

The email inbox is a dangerous place, but pattern recognition tools make it possible to filter out most of the wasteful elements that may harm end users. Because phishing and spam strategies are adversarial and change over time, the number of variables needed for proper email classification has increased substantially. For many years these factors have driven the pattern recognition and machine learning communities to keep improving email filtering techniques. This work presents an embedded feature selection approach that determines a non-linear decision boundary with minimal error and a reduced number of features by penalizing their use in the dual formulation of binary Support Vector Machines (SVMs). The proposed method optimizes the widths of an anisotropic RBF kernel via successive gradient descent steps, eliminating features that have low relevance for the model. Experiments with two real-world spam and phishing data sets show that the approach outperforms well-known feature selection algorithms while consistently using fewer variables.
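
To make the idea of an anisotropic RBF kernel concrete, the following is a minimal sketch and not the chapter's implementation. It assumes one common parameterization, K(x, z) = exp(-0.5 * sum_j (sigma_j * (x_j - z_j))^2), in which each feature j has its own width parameter sigma_j and a feature whose sigma_j reaches zero no longer affects the kernel. The function names, the threshold, and the example values are hypothetical; the gradient descent update of the widths against the penalized dual objective is omitted because the abstract does not specify it.

```python
import math


def anisotropic_rbf(x, z, sigma):
    """Anisotropic RBF kernel with one width parameter per feature.

    Assumed form: exp(-0.5 * sum_j (sigma_j * (x_j - z_j))**2).
    """
    sq_dist = sum((s * (a - b)) ** 2 for a, b, s in zip(x, z, sigma))
    return math.exp(-0.5 * sq_dist)


def prune_features(sigma, threshold):
    """Set to zero every width whose magnitude is below the threshold
    (hypothetical elimination rule, for illustration only)."""
    return [s if abs(s) >= threshold else 0.0 for s in sigma]


def selected_features(sigma):
    """Indices of the features that still influence the kernel."""
    return [j for j, s in enumerate(sigma) if s != 0.0]


# Illustrative widths, as if produced by some earlier optimization step.
sigma = prune_features([0.9, 0.01, 0.4], threshold=0.05)
kept = selected_features(sigma)

# Two points that differ only in the pruned feature (index 1).
k_same = anisotropic_rbf([1.0, 5.0, 2.0], [1.0, -3.0, 2.0], sigma)
```

In this sketch, two messages that differ only in a pruned feature are treated as identical by the kernel, which is how driving a width to zero acts as feature elimination.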

Cite

Maldonado, S., & L’Huillier, G. (2013). SVM-Based Feature Selection and Classification for Email Filtering (pp. 135–148). https://doi.org/10.1007/978-3-642-36530-0_11
