With the continuous growth in the number of email users, the volume of unsolicited email, also known as spam, has increased substantially. Server-side and client-side anti-spam filters have been developed to detect various features of spam emails. Recently, however, spammers have introduced new tricks that embed spam content in attachments such as digital images, PDF files, and DOC files, which can render ineffective all current techniques based on analysis of the digital text in the body and subject fields of emails. In this paper, we propose an anti-spam filtering approach based on data mining techniques that classifies emails as spam or ham. The effectiveness of the proposed approach is experimentally evaluated on a large corpus of plain-text datasets as well as text-embedded image datasets, and classifiers such as Random Forest and Naive Bayes are compared.
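To make the content-based classification step concrete, the following is a minimal sketch of a multinomial Naive Bayes spam/ham classifier with add-one smoothing, written in plain Python. It is not the authors' implementation: the function names (`tokenize`, `train_naive_bayes`, `classify`), the whitespace tokenizer, and the four-message toy corpus are all assumptions made for illustration. It also assumes that any text embedded in image attachments has already been extracted to a string by a separate step, and it does not cover the Random Forest classifier.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization is enough for a toy example.
    return text.lower().split()


def train_naive_bayes(samples):
    """Collect per-class document and word counts.

    samples: list of (text, label) pairs, where label is "spam" or "ham".
    """
    doc_counts = Counter()   # number of training messages per class
    word_counts = {}         # class label -> Counter of token frequencies
    vocab = set()            # every token seen during training
    for text, label in samples:
        doc_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for tok in tokenize(text):
            counts[tok] += 1
            vocab.add(tok)
    return doc_counts, word_counts, vocab


def classify(text, model):
    """Return the class with the highest log posterior for the given text."""
    doc_counts, word_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        # Log prior: fraction of training messages in this class.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokenize(text):
            if tok not in vocab:
                continue  # tokens never seen in training carry no evidence
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log(
                (word_counts[label][tok] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy corpus, invented for this sketch only.
TOY_CORPUS = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday with the team", "ham"),
]

model = train_naive_bayes(TOY_CORPUS)
print(classify("free money prize", model))
print(classify("monday team meeting", model))
```

Working in log space keeps the product of many small word probabilities from underflowing, which matters once messages are longer than a few tokens.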
CITATION STYLE
Sharma, A. K., Kaur, P., & Anand, S. K. (2014). Evaluation of content based spam filtering using data mining approach applied on text and image corpus. In Advances in Intelligent Systems and Computing (Vol. 258, pp. 561–577). Springer Verlag. https://doi.org/10.1007/978-81-322-1771-8_50