The Performance Benchmark of Decision Tree Algorithms for Spam e-mail Detection

Eyüp Akçetin; Ufuk Çelik

Journal Article, Open Access

Journal of Internet Applications and Management (2014) 5(2) 43-56

DOI: 10.5505/iuyd.2014.43531

Abstract

The objective of this study is to determine the most suitable decision tree method for identifying spam e-mails, in terms of accuracy and model build time, by comparing the performance of decision tree algorithms. The data were taken from one of the University of California machine learning datasets, which contains 4601 e-mails for spam classification. The e-mails were classified with 12 different decision tree algorithms in the WEKA machine learning software, using 10-fold cross-validation. The performance of this classification was determined by applying principal component analysis. The decision trees detected spam e-mails with accuracy rates ranging from 91% to 94.68%. The Random Forest algorithm was the best classifier, with an accuracy rate of 94.68%. Because the algorithm's model build time is 2.11 seconds for the 4601 e-mails, it can classify spam e-mails quickly in a busy e-mail exchange system.
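
The evaluation protocol named in the abstract, 10-fold cross-validation reporting accuracy and model build time, can be illustrated with a minimal sketch. It is not the authors' WEKA setup: it uses only the Python standard library, a one-level decision tree (a "stump") stands in for WEKA's tree learners, and a small synthetic dataset stands in for the 4601 e-mails. All function names (`k_fold_indices`, `fit_stump`, `predict_stump`, `cross_validate`, `make_toy_data`) are hypothetical and introduced here only for illustration.

```python
import random
import time


def k_fold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and deal the indices into k folds (sizes differ by at most one)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_stump(X, y):
    """Fit a one-level decision tree (stand-in learner, not a WEKA algorithm).

    Tries every feature and every observed value as a threshold, and keeps
    the split with the fewest training errors.
    """
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            # Errors when predicting class 1 for values above the threshold.
            errors = sum((row[j] > t) != bool(label) for row, label in zip(X, y))
            # polarity 1: above -> class 1; polarity 0: above -> class 0.
            for polarity, err in ((1, errors), (0, len(y) - errors)):
                if best is None or err < best[0]:
                    best = (err, j, t, polarity)
    _, j, t, polarity = best
    return j, t, polarity


def predict_stump(model, row):
    """Predict 0 or 1 for a single feature row."""
    j, t, polarity = model
    above = row[j] > t
    return int(above) if polarity == 1 else int(not above)


def cross_validate(X, y, k=10, seed=0):
    """Return (accuracy over all held-out rows, total seconds spent building models)."""
    folds = k_fold_indices(len(X), k, seed)
    correct = 0
    build_seconds = 0.0
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        start = time.perf_counter()
        model = fit_stump([X[i] for i in train], [y[i] for i in train])
        build_seconds += time.perf_counter() - start
        correct += sum(predict_stump(model, X[i]) == y[i] for i in held_out)
    return correct / len(X), build_seconds


def make_toy_data(n=200, seed=1):
    """Synthetic two-feature data (an assumption, not the study's dataset)."""
    rng = random.Random(seed)
    X = [[rng.random(), rng.random()] for _ in range(n)]
    y = [int(row[0] > 0.5) for row in X]
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    accuracy, seconds = cross_validate(X, y, k=10)
    print(f"accuracy={accuracy:.4f}, build time={seconds:.3f}s")
```

Each row is held out exactly once, so the accuracy is computed over the whole dataset while every prediction comes from a model that never saw that row. Timing only the fitting step mirrors the abstract's separation of accuracy from model build time.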

Citation (APA)

Akçetin, E., & Çelik, U. (2014). The Performance Benchmark of Decision Tree Algorithms for Spam e-mail Detection. Journal of Internet Applications and Management, 5(2), 43–56. https://doi.org/10.5505/iuyd.2014.43531
