Low Time Complexity Model for Email Spam Detection using Logistic Regression

Zubeda K. Mrisho; Anael Elkana Sam; Jema David Ndibwile

Journal Article, Open Access

International Journal of Advanced Computer Science and Applications (2021) 12(12) 112-118

DOI: 10.14569/IJACSA.2021.0121215

Abstract

Spam emails have become a concern on the Internet. Machine learning techniques such as Neural Networks, Naïve Bayes, and Decision Trees are frequently used to combat them. Although these techniques are effective, their time complexity on high-dimensional datasets remains a significant challenge: because such datasets contain a large number of features, the complexity of the problem grows exponentially, and existing approaches carry a heavy computational burden when thousands of features are used. Reducing time complexity while improving accuracy on high-dimensional datasets requires the extra steps of feature selection and parameter tuning. This work recommends a hybrid logistic regression model, combined with feature selection and parameter tuning, that can effectively handle high-dimensional datasets. The model uses the Term Frequency-Inverse Document Frequency (TF-IDF) feature extraction method to mitigate the drawbacks of plain Term Frequency (TF), so that features are weighted more evenly. Using the publicly available Enron and Lingspam datasets, we compared the model’s performance with that of other contemporary models. The proposed model achieved low time complexity while maintaining a high spam detection rate of 99.1%.
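
The pipeline outlined in the abstract (TF-IDF extraction, feature selection, then a tuned logistic regression) can be illustrated with a minimal, standard-library-only sketch. Everything below is an assumption made for illustration rather than the authors' implementation: the four-message toy corpus, the smoothed IDF variant, the class-mean-difference score used for feature selection, and the hyperparameter values (`k`, `lr`, `epochs`, `l2`) are all hypothetical stand-ins for choices the abstract does not specify.

```python
import math
from collections import Counter

def fit_idf(docs):
    """Smoothed IDF per term: log((1 + n) / (1 + df)) + 1 (one common variant)."""
    n = len(docs)
    df = Counter(t for doc in docs for t in set(doc))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

def tf_idf_vector(doc, terms, idf):
    """Relative term frequency times IDF, restricted to the given terms."""
    counts = Counter(doc)
    total = len(doc) or 1  # guard against empty messages
    return [counts[t] / total * idf[t] for t in terms]

def select_top_k(X, y, terms, k):
    """Hypothetical filter: keep the k terms whose mean TF-IDF differs most between classes."""
    def class_mean(j, label):
        vals = [x[j] for x, yi in zip(X, y) if yi == label]
        return sum(vals) / len(vals)
    scores = [abs(class_mean(j, 1) - class_mean(j, 0)) for j in range(len(terms))]
    keep = sorted(sorted(range(len(terms)), key=lambda j: -scores[j])[:k])
    return [terms[j] for j in keep]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(X, y, lr=1.0, epochs=2000, l2=0.01):
    """Batch gradient descent on the L2-regularised logistic loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        # Prediction error for every training message under the current weights.
        errs = [sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) - yi
                for x, yi in zip(X, y)]
        w = [wj - lr * (sum(e * x[j] for e, x in zip(errs, X)) / n + l2 * wj)
             for j, wj in enumerate(w)]
        b -= lr * sum(errs) / n
    return w, b

def predict_proba(doc, terms, idf, w, b):
    """Estimated probability that a tokenised message is spam."""
    x = tf_idf_vector(doc, terms, idf)
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Toy corpus (illustrative only): label 1 = spam, 0 = legitimate.
emails = ["win free prize now", "free money click now",
          "meeting agenda for monday", "please review the project report"]
labels = [1, 1, 0, 0]
docs = [e.split() for e in emails]

idf = fit_idf(docs)
vocab = sorted(idf)
X_full = [tf_idf_vector(d, vocab, idf) for d in docs]

# Shrinking the feature set is what cuts the per-epoch training cost.
selected = select_top_k(X_full, labels, vocab, k=2)
X = [tf_idf_vector(d, selected, idf) for d in docs]

w, b = train_logreg(X, labels)
probs = [predict_proba(d, selected, idf, w, b) for d in docs]
```

Each training epoch costs time proportional to the number of messages times the number of retained features, so reducing the vocabulary to `k` terms before training is where a sketch like this saves work. On a real corpus, `k`, `lr`, and `l2` would be chosen by validation rather than fixed by hand.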

Cite (APA)

Mrisho, Z. K., Sam, A. E., & Ndibwile, J. D. (2021). Low Time Complexity Model for Email Spam Detection using Logistic Regression. International Journal of Advanced Computer Science and Applications, 12(12), 112–118. https://doi.org/10.14569/IJACSA.2021.0121215
