A novel feature selection based on one-way ANOVA F-test for e-mail spam classification

Nadir Omer Fadl Elssied; Othman Ibrahim; Ahmed Hamza Osman

Journal Article, Open Access

Research Journal of Applied Sciences, Engineering and Technology (2014) 7(3) 625-638

DOI: 10.19026/rjaset.7.299

112 Citations

144 Readers

Abstract

Spam is commonly defined as unwanted e-mail, and it has become a global threat to e-mail users. Although the Support Vector Machine (SVM) is widely used for e-mail spam classification, the high dimensionality of the feature space, caused by the large number of e-mails and features in the datasets, remains a problem. To address this limitation of SVM, reduce computational complexity (efficiency), and improve classification accuracy (effectiveness), this study applies a feature selection scheme based on the one-way ANOVA F-test statistic to determine the features that contribute most to e-mail spam classification. The scheme reduces the dimensionality of the feature space before the classification step. The proposed scheme was evaluated experimentally on the well-known Spambase benchmark dataset to assess its feasibility. Comparisons were made across different datasets, classification algorithms, and performance measures. The experimental results on the English-language Spambase dataset showed that the enhanced SVM (FSSVM) significantly outperforms plain SVM and many other recent spam classification methods in terms of computational complexity and dimension reduction.
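The selection step described in the abstract can be illustrated with a minimal sketch. For each feature, the one-way ANOVA F statistic is the ratio of the between-class mean square to the within-class mean square, so a feature whose values differ strongly between spam and non-spam scores high. The function names, the toy data, and the "keep the top k features" cutoff below are assumptions made for illustration; the abstract does not state how many features were retained or how the threshold was chosen, and the SVM training step is omitted.

```python
from statistics import fmean


def anova_f_score(values, labels):
    """One-way ANOVA F statistic of a single feature across the class labels."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)

    n, k = len(values), len(groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two classes and more samples than classes")

    grand_mean = fmean(values)
    class_means = {label: fmean(group) for label, group in groups.items()}

    # Between-class sum of squares: spread of class means around the grand mean.
    ss_between = sum(
        len(group) * (class_means[label] - grand_mean) ** 2
        for label, group in groups.items()
    )
    # Within-class sum of squares: spread of samples around their own class mean.
    ss_within = sum(
        (value - class_means[label]) ** 2
        for label, group in groups.items()
        for value in group
    )

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    if ms_within == 0:
        # Perfectly separated (or constant) feature: avoid dividing by zero.
        return float("inf") if ms_between > 0 else 0.0
    return ms_between / ms_within


def select_top_features(X, y, k):
    """Return the column indices of the k features with the highest F score.

    The top-k rule is an illustrative assumption, not the paper's criterion.
    """
    n_features = len(X[0])
    scores = [anova_f_score([row[j] for row in X], y) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


# Hypothetical toy data: 6 e-mails, 3 features, label 1 = spam, 0 = non-spam.
X = [
    [5, 1, 4],
    [6, 2, 5],
    [7, 3, 6],
    [1, 1, 2],
    [2, 2, 3],
    [3, 3, 4],
]
y = [1, 1, 1, 0, 0, 0]

selected = select_top_features(X, y, 2)
print(selected)  # prints [0, 2]
```

In this toy example the second feature has identical values in both classes, so its F score is 0 and it is dropped; only the reduced feature columns would then be passed to the classifier.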

Cite (APA)

Elssied, N. O. F., Ibrahim, O., & Osman, A. H. (2014). A novel feature selection based on one-way ANOVA F-test for e-mail spam classification. Research Journal of Applied Sciences, Engineering and Technology, 7(3), 625–638. https://doi.org/10.19026/rjaset.7.299
