Application of learning algorithms to image spam evolution

Shruti Wakade; Kathy J. Liszka; Chien Chung Chan

Journal Article

Smart Innovation, Systems and Technologies (2013) 13 471-495

DOI: 10.1007/978-3-642-28699-5_18

Abstract

Spam filters have become very proficient at identifying text spam, so spammers have developed different techniques to bypass filters. One such method is image spam, which first appeared in 2005 and quickly grew in popularity. KnujOn is a web site that collects and sorts spam for investigations and data analysis of email-based threats. We have been collecting image spam from KnujOn on a daily basis since April 2008, culminating in a significantly large corpus of real data. In this chapter, we identify eight features for distinguishing computer-generated image spam from ham (non-spam). We classify the images using J48 decision trees, both with and without reduced error pruning. Finally, we perform a validation by feature analysis on thirteen months of our corpus and observe that our classification scheme is not affected by changes made to images for the purpose of avoiding OCR detection. © Springer-Verlag Berlin Heidelberg 2013.

Cite

Wakade, S., Liszka, K. J., & Chan, C. C. (2013). Application of learning algorithms to image spam evolution. Smart Innovation, Systems and Technologies, 13, 471–495. https://doi.org/10.1007/978-3-642-28699-5_18
