Learning to Semantically Classify Email Messages

Abstract

Latent Semantic Indexing (LSI) is a semantic vector space model for information retrieval (IR) that uses singular value decomposition (SVD) to transform individual documents into statistically derived semantic vectors. This paper proposes a new junk email (spam) filtering model, 2LSI-SF, which is based on augmented category LSI spaces and classifies email messages by their content. The model exploits the discriminative information in the training data and incorporates several feature selection and message classification algorithms. Experiments with 2LSI-SF were conducted on a benchmark spam testing corpus (PU1) and a newly compiled Chinese spam corpus (ZH1). The experimental results, together with a performance comparison against the popular Support Vector Machine (SVM) and naïve Bayes classifiers, show that 2LSI-SF is capable of filtering spam effectively.
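The general LSI idea mentioned in the abstract (SVD mapping documents to low-dimensional semantic vectors) can be illustrated with a minimal sketch. This is not the 2LSI-SF model: the abstract does not give its details, so the example assumes a single LSI space, raw term counts, and a plain nearest-neighbour rule with cosine similarity. The toy messages, labels, and all function names (`build_term_doc_matrix`, `lsi_fit`, `lsi_fold_in`, `classify`) are hypothetical and chosen only for illustration.

```python
import numpy as np

# Toy training data (hypothetical, not taken from PU1 or ZH1).
TRAIN_DOCS = [
    "cheap pills buy now",
    "buy cheap watches now",
    "meeting agenda for monday",
    "project meeting notes monday",
]
TRAIN_LABELS = ["spam", "spam", "legit", "legit"]
VOCAB = sorted({tok for doc in TRAIN_DOCS for tok in doc.split()})


def term_vector(text, vocab):
    """Raw term-count vector for one message over a fixed vocabulary."""
    index = {term: i for i, term in enumerate(vocab)}
    vec = np.zeros(len(vocab))
    for tok in text.lower().split():
        if tok in index:
            vec[index[tok]] += 1.0
    return vec


def build_term_doc_matrix(docs, vocab):
    """Term-document matrix: rows are terms, columns are documents."""
    return np.column_stack([term_vector(d, vocab) for d in docs])


def lsi_fit(A, k):
    """Rank-k truncated SVD of A; returns U_k, the top-k singular values,
    and one k-dimensional semantic vector per training document."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :].T


def lsi_fold_in(q, U_k, s_k):
    """Project a new term vector into the LSI space: q^T U_k diag(s_k)^-1."""
    return (q @ U_k) / s_k


def cosine(a, b):
    """Cosine similarity with a small guard against zero-length vectors."""
    return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)


# Fit a 2-dimensional LSI space on the toy corpus (k=2 is an assumption).
A = build_term_doc_matrix(TRAIN_DOCS, VOCAB)
U_K, S_K, DOC_VECS = lsi_fit(A, k=2)


def classify(message):
    """Label a message with the class of its nearest training document
    in the LSI space (an assumed decision rule, not the paper's)."""
    q_hat = lsi_fold_in(term_vector(message, VOCAB), U_K, S_K)
    sims = [cosine(q_hat, d) for d in DOC_VECS]
    return TRAIN_LABELS[int(np.argmax(sims))]


print(classify("buy cheap pills"))
print(classify("monday meeting"))
```

In practice the term counts would usually be weighted (for example with tf-idf) and the dimension `k` tuned on held-out data; both choices are left out here to keep the sketch short.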

Cite

Jiang, E. (2006). Learning to Semantically Classify Email Messages. In Lecture Notes in Control and Information Sciences (Vol. 344, pp. 700–711). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-540-37256-1_86
