Phishing sites detection from a web developer's perspective using machine learning

Abstract

The Internet has enabled unprecedented communication and new technologies. It has also brought the bane of phishing and exacerbated vulnerabilities. In this paper, we propose a model to detect phishing webpages from a web developer's perspective. From this standpoint, we design 120 novel features based on webpage content, four novel time-based features, and two novel search-based features; we also use 34 other content-based and 11 heuristic features to optimize the model. We select Random Committee (base learner: Random Tree) for our framework because it performed best in a comparison with six other algorithms: Hellinger Distance Decision Tree, SVM, Logistic Regression, J48, Naive Bayes, and Random Forest. In real-time experiments, the model achieved 99.4% precision and 98.3% MCC with a 0.1% false positive rate in 5-fold cross-validation under the realistic scenario of an unbalanced dataset.
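
The three reported measures (precision, MCC, and false positive rate) all derive from the cells of a binary confusion matrix. The following is a minimal sketch in Python of how they are computed. The function names and the example counts are hypothetical and chosen for illustration only; they are not taken from the paper or its dataset.

```python
import math

def precision(tp, fp):
    """Fraction of pages flagged as phishing that really are phishing."""
    return tp / (tp + fp) if (tp + fp) else 0.0

def false_positive_rate(fp, tn):
    """Fraction of legitimate pages wrongly flagged as phishing."""
    return fp / (fp + tn) if (fp + tn) else 0.0

def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient, in the range [-1, 1]."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal total is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical counts for an unbalanced test set
# (100 phishing pages, 900 legitimate pages); not from the paper.
tp, fp, tn, fn = 90, 10, 890, 10

print(f"precision = {precision(tp, fp):.3f}")
print(f"FPR       = {false_positive_rate(fp, tn):.3f}")
print(f"MCC       = {mcc(tp, fp, tn, fn):.3f}")
```

MCC uses all four cells of the confusion matrix, so it stays informative when legitimate pages far outnumber phishing pages, which is why it is a common companion to precision on unbalanced data.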

Cite

Zhou, X., & Verma, R. M. (2020). Phishing sites detection from a web developer’s perspective using machine learning. In Proceedings of the Annual Hawaii International Conference on System Sciences (Vol. 2020-January, pp. 6486–6495). IEEE Computer Society. https://doi.org/10.24251/hicss.2020.794
