The Internet has enabled unprecedented communication and new technologies. At the same time, it has brought the bane of phishing and exacerbated existing vulnerabilities. In this paper, we propose a model that detects phishing webpages from a web developer's perspective. From this standpoint, we design 120 novel features based on webpage content, along with four novel time-based and two novel search-based features, and we add 34 other content-based and 11 heuristic features to optimize the model. We select Random Committee (base learner: Random Tree) for our framework because it performed best in a comparison with six other algorithms: Hellinger Distance Decision Tree, SVM, Logistic Regression, J48, Naive Bayes, and Random Forest. In real-time experiments, the model achieved 99.4% precision and 98.3% MCC with a 0.1% false positive rate in 5-fold cross-validation under the realistic scenario of an unbalanced dataset.
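The three reported metrics (precision, false positive rate, and the Matthews correlation coefficient, MCC) can all be computed from a binary confusion matrix. The sketch below uses their standard definitions. The function names and the counts are hypothetical, chosen only to illustrate an unbalanced dataset; they are not taken from the paper's experiments.

```python
import math


def precision(tp: int, fp: int) -> float:
    """Fraction of pages flagged as phishing that really are phishing."""
    flagged = tp + fp
    return tp / flagged if flagged else 0.0


def false_positive_rate(fp: int, tn: int) -> float:
    """Fraction of legitimate pages that are wrongly flagged as phishing."""
    legitimate = fp + tn
    return fp / legitimate if legitimate else 0.0


def mcc(tp: int, fp: int, tn: int, fn: int) -> float:
    """Matthews correlation coefficient, ranging from -1 to +1.

    MCC uses all four cells of the confusion matrix, which makes it
    more informative than accuracy when the classes are unbalanced.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical counts: 100 phishing pages among 10,000 pages in total.
    tp, fp, tn, fn = 90, 10, 9890, 10
    print(f"precision = {precision(tp, fp):.3f}")
    print(f"FPR       = {false_positive_rate(fp, tn):.4f}")
    print(f"MCC       = {mcc(tp, fp, tn, fn):.3f}")
```

With only a small share of pages being phishing, a classifier can reach high accuracy by labeling almost everything legitimate, which is why precision, false positive rate, and MCC are the more telling figures on an unbalanced dataset.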
CITATION STYLE
Zhou, X., & Verma, R. M. (2020). Phishing sites detection from a web developer’s perspective using machine learning. In Proceedings of the Annual Hawaii International Conference on System Sciences (Vol. 2020-January, pp. 6486–6495). IEEE Computer Society. https://doi.org/10.24251/hicss.2020.794