Phishing is a major problem in which fraudulent websites and emails aim to trick users into revealing important information such as financial data, emails, and other private information. Phishing activity has been on the rise, and many unsuspecting users have fallen victim to these websites and fraudulent emails. This paper analyzes the design and evaluation of the features used to detect and reduce such fraudulent activity. The selected features depend not only on the characteristics of the URL (Uniform Resource Locator) but also on the website content. The TF-IDF algorithm is used to compute the top keywords of the website content, from which one of the important features is extracted. The technique was evaluated on a dataset of 4,420 legitimate URLs and 5,389 phishing URLs. Using these features with five classification algorithms, the resulting classifiers achieve 98.8% accuracy in detecting phishing website URLs.
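The TF-IDF keyword step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `tokenize` and `top_keywords`, the regex tokenizer, the smoothed IDF formula, and the toy reference corpus are all assumptions made for illustration. The abstract does not say how the top keywords are turned into a feature, so the sketch stops at ranking the keywords.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens (an assumed, simple tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def top_keywords(target, corpus, k=5):
    """Return the k terms of `target` with the highest TF-IDF score.

    `corpus` is a list of reference documents used to estimate document
    frequency; the target page is counted as one extra document.
    """
    tokens = tokenize(target)
    if not tokens:
        return []
    reference_docs = [set(tokenize(doc)) for doc in corpus]
    n_docs = len(reference_docs) + 1  # reference documents plus the target
    counts = Counter(tokens)
    scores = {}
    for term, count in counts.items():
        tf = count / len(tokens)
        # The term occurs in the target, so document frequency is at least 1.
        df = 1 + sum(1 for doc in reference_docs if term in doc)
        idf = math.log(n_docs / df) + 1.0  # assumed smoothing: common terms keep a small weight
        scores[term] = tf * idf
    # Sort by descending score, breaking ties alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


if __name__ == "__main__":
    # Hypothetical page text and reference corpus, for illustration only.
    corpus = ["the bank is open today", "the weather is nice today"]
    page = "paypal login verify your paypal account the bank"
    print(top_keywords(page, corpus, k=3))
```

Terms that are frequent on the page but rare in the reference corpus rank highest, while common words such as "the" fall to the bottom of the ranking.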
CITATION STYLE
Nguyen, H. H., & Nguyen, D. T. (2016). Machine learning based phishing web sites detection. In Lecture Notes in Electrical Engineering (Vol. 371, pp. 123–131). Springer Verlag. https://doi.org/10.1007/978-3-319-27247-4_11