Real-Time and Zero-Footprint Bag of Synthetic Syllables Algorithm for E-mail Spam Detection Using Subject Line and Short Text Fields


Abstract

Contemporary e-mail services face high availability expectations from customers and are resource-strained by high-volume throughput and spam attacks. Deep machine learning architectures, which are resource-hungry and require offline processing because of their long processing times, are not accepted at the front-line filters. On the other hand, the bulk of incoming spam is not sophisticated enough to bypass even the simplest algorithms. While the small fraction of intelligent, highly mutable spam can be detected only by deep architectures, the load on them can be reduced by simple near-real-time, near-zero-footprint algorithms such as the bag of synthetic syllables algorithm, applied to the short texts of e-mail subject lines and other short text fields. The proposed algorithm creates a sparse hash, or vector, of circa 200 dimensions for each e-mail subject line, which can be compared by cosine or Euclidean distance with known spammy subjects to find similarities. The algorithm does not require any persistent storage, dictionaries, hardware upgrades, or additional software packages. The performance of the algorithm is presented on one day of real SMTP traffic.
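
The abstract does not say how the synthetic syllables are built, so the following is only a rough sketch of the general idea rather than the author's method. It assumes, purely for illustration, that a "syllable" is an overlapping two-character chunk of the normalized subject line, that chunks are hashed into 200 buckets with CRC32, and that subjects are compared by cosine similarity. The names synthetic_syllables, sparse_vector, cosine_similarity and DIMENSIONS are hypothetical.

```python
import math
import zlib
from collections import Counter

# "circa 200" dimensions per the abstract; the exact value is an assumption.
DIMENSIONS = 200


def synthetic_syllables(text):
    """Split text into overlapping two-character chunks.

    Hypothetical stand-in: the abstract does not define the real
    syllable construction.
    """
    letters = [ch for ch in text.lower() if ch.isalnum()]
    return ["".join(letters[i:i + 2]) for i in range(len(letters) - 1)]


def sparse_vector(subject):
    """Count chunks per hashed bucket; only non-zero buckets are stored."""
    return Counter(
        zlib.crc32(chunk.encode("utf-8")) % DIMENSIONS
        for chunk in synthetic_syllables(subject)
    )


def cosine_similarity(a, b):
    """Cosine similarity of two sparse vectors held as bucket -> count maps."""
    dot = sum(count * b.get(bucket, 0) for bucket, count in a.items())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty subject is similar to nothing
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    known_spam = sparse_vector("Cheap meds online now")
    incoming = sparse_vector("cheap meds 0nline now!!")
    print(cosine_similarity(known_spam, incoming))
```

The sketch uses only the standard library and keeps everything in memory, in the spirit of the no-dictionary, no-persistent-storage claim; a filter built this way would flag a subject whose similarity to a known spammy subject exceeds some threshold, which would have to be tuned on real traffic.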

Citation (APA)

Selitskiy, S. (2023). Real-Time and Zero-Footprint Bag of Synthetic Syllables Algorithm for E-mail Spam Detection Using Subject Line and Short Text Fields. In Lecture Notes in Networks and Systems (Vol. 448, pp. 257–265). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-981-19-1610-6_22
