Document classification method based on latent semantic indexing

Jeong Joon Kim; Yong Soo Lee; Jin Yong Moon; Jeong Min Park

Journal Article (Open Access)

International Journal of Grid and Distributed Computing (2018) 11(4) 97-112

DOI: 10.14257/ijgdc.2018.11.4.09

Abstract

Among existing approaches, Latent Semantic Indexing and Non-negative Matrix Factorization, algorithms that classify documents by meaning, try to solve the problem by converting each document into a vector. However, these algorithms have two problems: their interpretation differs depending on the training documents, and they have difficulty analyzing multiple representations of the same term. Meanwhile, WordNet is a lexical dictionary that describes the relationships between words based on the science of human intelligence, and it is widely used in applications such as query term expansion in search engines. However, it is difficult for WordNet to adapt to neologisms, slang, and shifts in word meaning in fast-changing times. Therefore, in this paper we address the problem of multiple word representations by partially applying WordNet's word relationships to Latent Semantic Indexing, using genetic algorithms, so that documents are clustered more efficiently in a way that accounts for the strengths and weaknesses of both Latent Semantic Indexing and WordNet. With this approach we aim to improve precision and increase the efficiency of the overall clusters.
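
To make the "multiple representations of a term" problem concrete, the following is a minimal sketch, not the authors' method. It omits the singular value decomposition used by Latent Semantic Indexing and the genetic algorithm entirely, and only shows how folding synonyms together before vectorization can raise the similarity of two documents that use different words for the same concept. The `SYNONYMS` map is a hypothetical stand-in for WordNet relations, and the function names are assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical synonym map standing in for WordNet relations (illustrative only).
SYNONYMS = {"automobile": "car", "auto": "car", "physician": "doctor"}

def term_vector(text, synonyms=None):
    """Return a bag-of-words Counter, optionally mapping synonyms to one term."""
    synonyms = synonyms or {}
    tokens = text.lower().split()
    return Counter(synonyms.get(tok, tok) for tok in tokens)

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

doc_a = "the car needs repair"
doc_b = "the automobile needs repair"

# Without synonym folding, "car" and "automobile" are unrelated dimensions.
plain = cosine(term_vector(doc_a), term_vector(doc_b))
# With folding, both documents map to the same vector.
folded = cosine(term_vector(doc_a, SYNONYMS), term_vector(doc_b, SYNONYMS))

print(f"plain={plain:.2f} folded={folded:.2f}")
```

In this toy case the two documents differ only in one synonymous term, so folding makes their vectors identical. A fixed synonym map also shows the limitation the abstract attributes to WordNet: any neologism or slang term absent from the map is left unmerged.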

Cite

Kim, J. J., Lee, Y. S., Moon, J. Y., & Park, J. M. (2018). Document classification method based on latent semantic indexing. International Journal of Grid and Distributed Computing, 11(4), 97–112. https://doi.org/10.14257/ijgdc.2018.11.4.09