A self-training method for detection of phishing websites

Xue Peng Jia; Xiao Feng Rong

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2018) 10943 LNCS 414-425

DOI: 10.1007/978-3-319-93803-5_39

Abstract

Machine-learning-based phishing detection typically lacks training data with high-confidence labels. To reduce the impact of scarce training labels on phishing detection performance, this paper proposes an improved self-training method for semi-supervised learning. It follows the divide-and-conquer principle, decomposing the original problem into a number of smaller sub-problems similar to the original. Using four classifiers, we compare classification quality across supervised learning, traditional semi-supervised learning, and the proposed method, and we compare running time between the two semi-supervised methods. The improved method, which divides the unlabeled dataset into equal parts, reduces running time by 50% while keeping classification performance equal to that of the traditional self-training method. Furthermore, running time continues to fall significantly as the number of partitions of the unlabeled dataset increases. The experimental results show that the proposed improved self-training method outperforms the traditional self-training method.
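
The abstract does not spell out the algorithm, so the following is only a minimal sketch of one possible reading: the unlabeled set is split into equal parts, and a standard self-training loop runs on each part in turn, carrying the pseudo-labels forward. The nearest-centroid classifier, the confidence score, the 0.8 threshold, the function names, and the two-feature toy data are all assumptions made for illustration; none of them come from the paper.

```python
from math import dist

def fit_centroids(X, y):
    """Train a nearest-centroid classifier: one mean vector per label."""
    sums, counts = {}, {}
    for features, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def predict_with_confidence(centroids, features):
    """Return (label, confidence). Assumed score: 1 - d_nearest / sum of distances."""
    distances = {label: dist(features, c) for label, c in centroids.items()}
    label = min(distances, key=distances.get)
    total = sum(distances.values())
    confidence = 1.0 if total == 0 else 1.0 - distances[label] / total
    return label, confidence

def self_train(X_lab, y_lab, X_unlab, threshold=0.8, max_rounds=10):
    """Traditional self-training: pseudo-label confident samples, then refit."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unlab)
    for _ in range(max_rounds):
        centroids = fit_centroids(X_lab, y_lab)
        remaining = []
        for features in pool:
            label, confidence = predict_with_confidence(centroids, features)
            if confidence >= threshold:
                X_lab.append(features)
                y_lab.append(label)
            else:
                remaining.append(features)
        if len(remaining) == len(pool):  # nothing new was labeled, so stop
            break
        pool = remaining
    return fit_centroids(X_lab, y_lab), X_lab, y_lab

def partitioned_self_train(X_lab, y_lab, X_unlab, n_parts=2, threshold=0.8):
    """Assumed divide-and-conquer variant: self-train on equal parts in sequence."""
    parts = [X_unlab[i::n_parts] for i in range(n_parts)]
    for part in parts:
        _, X_lab, y_lab = self_train(X_lab, y_lab, part, threshold)
    return fit_centroids(X_lab, y_lab)

# Toy data with two hypothetical features scaled to [0, 1].
X_labeled = [(0.0, 0.0), (1.0, 1.0)]
y_labeled = ["legitimate", "phishing"]
X_unlabeled = [(0.1, 0.0), (0.9, 1.0), (0.2, 0.1), (0.8, 0.9)]

model = partitioned_self_train(X_labeled, y_labeled, X_unlabeled, n_parts=2)
print(predict_with_confidence(model, (0.95, 0.9))[0])
```

In this sketch each self-training loop only rescans its own part of the unlabeled data, which is one plausible source of the running-time savings the abstract reports; how the paper actually combines the sub-problems may differ.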

Cite

Jia, X. P., & Rong, X. F. (2018). A self-training method for detection of phishing websites. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10943 LNCS, pp. 414–425). Springer Verlag. https://doi.org/10.1007/978-3-319-93803-5_39
