Comparative study of feature reduction and machine learning methods for spam detection


Abstract

Nowadays, e-mail is widely used for communication over the Internet. A large amount of Internet traffic is e-mail data. Many companies and organizations use e-mail services to promote their products and services. It is very important to filter out spam messages to save users’ precious time. Machine learning methods play a vital role in spam detection, but they face the problem of high-dimensional feature vectors. Feature reduction methods are therefore very important for obtaining better results from machine learning approaches. In this paper, Principal Component Analysis (PCA), Singular Value Decomposition (SVD), and Information Gain (IG) are used for feature reduction. E-mail messages are then classified as spam or ham using seven different classifiers, namely Naïve Bayesian, AdaBoost, Random Forest, Support Vector Machine, J48, Bagging, and JRip. A comparative study of these techniques is carried out on the TREC 2007 Spam e-mail Corpus with different feature sizes.
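As a rough illustration of one of the feature reduction methods named above, the sketch below ranks terms by Information Gain and keeps the top k. It is not the authors' implementation: the abstract does not say how features were encoded, so the binary term-presence representation, the toy messages, and the function names (`entropy`, `information_gain`, `top_k_terms`) are all assumptions made for this example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def information_gain(docs, labels, term):
    """IG of a binary feature: is `term` present in the message or not?"""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = (
        (len(with_term) / n) * entropy(with_term)
        + (len(without_term) / n) * entropy(without_term)
    )
    return entropy(labels) - conditional


def top_k_terms(docs, labels, k):
    """Keep the k terms with the highest IG (ties broken alphabetically)."""
    vocab = set().union(*docs)
    ranked = sorted(vocab, key=lambda t: (-information_gain(docs, labels, t), t))
    return ranked[:k]


# Hypothetical toy corpus: each message is a set of tokens.
docs = [
    {"free", "offer", "click"},
    {"free", "winner"},
    {"meeting", "agenda"},
    {"agenda", "offer"},
]
labels = ["spam", "spam", "ham", "ham"]

selected = top_k_terms(docs, labels, k=2)
print(selected)
```

In this toy corpus, "free" and "agenda" separate the two classes perfectly, while "offer" appears once in each class and carries no information. The reduced term set would then be passed to whichever classifier is being compared.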

Cite

Agarwal, B., & Mittal, N. (2014). Comparative study of feature reduction and machine learning methods for spam detection. In Advances in Intelligent Systems and Computing (Vol. 236, pp. 761–769). Springer Verlag. https://doi.org/10.1007/978-81-322-1602-5_81
