Comparative study of feature reduction and machine learning methods for spam detection

Basant Agarwal; Namita Mittal

Conference Proceedings

Advances in Intelligent Systems and Computing (2014) 236 761-769

DOI: 10.1007/978-81-322-1602-5_81

Abstract

Nowadays, e-mail is widely used for communication over the Internet, and a large share of Internet traffic is e-mail data. Many companies and organizations use e-mail services to promote their products and services, so it is very important to filter out spam messages to save users’ precious time. Machine learning methods play a vital role in spam detection, but they face the problem of a high-dimensional feature vector. Feature reduction methods are therefore very important for obtaining better results from machine learning approaches. In this paper, Principal Component Analysis (PCA), Singular Value Decomposition (SVD), and Information Gain (IG) are used for feature reduction. The e-mail messages are then classified as spam or ham using seven different classifiers, namely Naïve Bayesian, AdaBoost, Random Forest, Support Vector Machine, J48, Bagging, and JRip. A comparative study of these techniques is carried out on the TREC 2007 Spam e-mail Corpus with different feature sizes.
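As a rough illustration of one of the feature reduction methods named in the abstract, the sketch below ranks terms by Information Gain and keeps the top k. It is not the authors' implementation: the toy messages, the labels, the binary "term occurs in the message" feature, and the helper names `entropy`, `information_gain`, and `top_k_terms` are all assumptions made for this example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels).values()
    return -sum((n / total) * math.log2(n / total) for n in counts)


def information_gain(docs, labels, term):
    """Information Gain of the binary feature 'term occurs in the message'."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    # Class entropy remaining after splitting the messages on the term.
    conditional = (len(with_term) / n) * entropy(with_term) \
        + (len(without_term) / n) * entropy(without_term)
    return entropy(labels) - conditional


def top_k_terms(docs, labels, k):
    """Return the k terms with the highest IG (ties broken alphabetically)."""
    vocab = sorted(set().union(*docs))
    scored = [(information_gain(docs, labels, t), t) for t in vocab]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [t for _, t in scored[:k]]


# Hypothetical toy corpus: each message is a set of terms (illustrative only).
docs = [
    {"free", "offer", "click"},
    {"free", "prize", "click"},
    {"meeting", "agenda", "offer"},
    {"meeting", "report"},
]
labels = ["spam", "spam", "ham", "ham"]

selected = top_k_terms(docs, labels, 3)
print(selected)
```

In this toy data, terms that occur only in spam or only in ham receive the highest scores, while "offer", which appears once in each class, scores zero. The reduced vocabulary would then be used to build the feature vectors passed to a classifier.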

Citation (APA)

Agarwal, B., & Mittal, N. (2014). Comparative study of feature reduction and machine learning methods for spam detection. In Advances in Intelligent Systems and Computing (Vol. 236, pp. 761–769). Springer Verlag. https://doi.org/10.1007/978-81-322-1602-5_81
