Abstract
The expansion of the internet has led to a steady increase in the number of cyber-attacks. One of the most common is social engineering, which exploits human psychology. Phishing is the most popular form of social engineering; it takes many forms, the traditional one being text messages. Techniques are needed to protect against these attacks, because awareness, usage policies, and other procedures are not enough. In this paper, we therefore propose using natural language processing (NLP) together with machine learning techniques for text phishing detection. We started with 6,224 emails from an existing dataset that contains both phishing and legitimate emails. NLP was used to prepare the data before features were extracted from it; the features were then used to train and test classification models built with machine learning algorithms. The features were extracted using the Continuous Bag of Words (CBOW) architecture of the Word2Vec algorithm. We trained four models using four different machine learning algorithms: k-nearest neighbors (KNN), Multinomial Naive Bayes (MNB), Decision Tree, and AdaBoost. The models had to classify each text message as either phishing or legitimate. Because the dataset is unbalanced, we used performance measures suited to unbalanced data in the evaluation. The three models trained with KNN, Decision Tree, and AdaBoost obtained considerable values, while the MNB model obtained an insignificant value.
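To illustrate why plain accuracy can mislead on unbalanced data, the following is a minimal sketch of measures commonly used in that setting. The abstract does not say which measures the authors used, so the choice here (precision, recall, F1, and balanced accuracy) is an assumption, as are the function names and the toy labels.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive="phishing"):
    """Count TP, FP, FN, TN, treating `positive` as the class of interest."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts["tp"], counts["fp"], counts["fn"], counts["tn"]


def safe_div(numerator, denominator):
    """Return 0.0 instead of raising when a metric is undefined."""
    return numerator / denominator if denominator else 0.0


def imbalance_aware_metrics(y_true, y_pred, positive="phishing"):
    """Hypothetical helper: accuracy plus measures that expose class imbalance."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)          # true positive rate
    specificity = safe_div(tn, tn + fp)     # true negative rate
    return {
        "accuracy": safe_div(tp + tn, tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": safe_div(2 * precision * recall, precision + recall),
        "balanced_accuracy": (recall + specificity) / 2,
    }


# Toy data (not from the paper): 8 legitimate and 2 phishing messages,
# scored against a classifier that labels everything as legitimate.
y_true = ["legitimate"] * 8 + ["phishing"] * 2
y_pred = ["legitimate"] * 10
metrics = imbalance_aware_metrics(y_true, y_pred)
print(metrics)
```

In this toy case the classifier never flags a phishing message, yet its accuracy still looks respectable because most messages are legitimate; recall, F1, and balanced accuracy make the failure visible.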
Cite
Alsufyani, A. A., & Alzahrani, S. M. (2021). Social engineering attack detection using machine learning: Text phishing attack. Indian Journal of Computer Science and Engineering, 12(3), 743–751. https://doi.org/10.21817/indjcse/2021/v12i3/211203298