Investigating the effect of combining text clustering with classification on improving spam email detection

Abstract

Email has become an easy and fast means of communication. As a result, filtering unsolicited (spam) email has become an important challenge. Recent text mining research has combined text clustering with classification to improve classification performance. In this paper, we investigate whether combining K-means text clustering with various supervised classification mechanisms improves the classification of emails as spam or non-spam. The two mechanisms are combined by adding extra features from the clustering step to the feature space used for classification. Our results show that combining K-means clustering with supervised classification in this way does not always improve classification performance. Moreover, in the cases where clustering does improve a classifier's performance, the gain in accuracy is very small and does not justify the additional time needed to build a learning model that combines both mechanisms. The experiments were carried out on the Enron-Spam datasets.
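The combination step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which cluster-derived features were added, so the one-hot cluster-membership encoding below is an assumption, as are the function names, the toy two-dimensional vectors standing in for email term features, and the choice of k = 2. It uses only the Python standard library and is not the paper's implementation.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd-style K-means. Returns (centroids, labels)."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct input points.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


def augment_with_cluster_features(points, labels, k):
    """Append a one-hot cluster-membership vector to each feature vector.

    ASSUMPTION: one-hot membership is only one possible choice of
    "extra features from the clustering step".
    """
    return [
        list(p) + [1.0 if lab == c else 0.0 for c in range(k)]
        for p, lab in zip(points, labels)
    ]


# Hypothetical toy data: each row stands in for an email's term features.
emails = [[5, 0], [4, 1], [6, 0], [0, 5], [1, 4], [0, 6]]
_, cluster_labels = kmeans(emails, k=2)
augmented = augment_with_cluster_features(emails, cluster_labels, k=2)
for row in augmented:
    print(row)
```

The augmented rows would then be passed to any supervised classifier in place of the original feature vectors. The clustering pass is extra work before training, which is the time cost the abstract weighs against the small accuracy gain.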

Cite

Hassan, D. (2017). Investigating the effect of combining text clustering with classification on improving spam email detection. In Advances in Intelligent Systems and Computing (Vol. 557, pp. 99–107). Springer Verlag. https://doi.org/10.1007/978-3-319-53480-0_10
