Abstract
Among existing studies, Latent Semantic Indexing (LSI) and Non-negative Matrix Factorization (NMF) are algorithms that classify documents by meaning by converting each document to a vector. These algorithms have two problems, however: their interpretation of a document varies with the training documents, and they have difficulty analyzing multiple representations of a term. WordNet, meanwhile, is a lexical dictionary that describes relationships between words, is grounded in the science of human intelligence, and is widely used for tasks such as query term expansion in search engines. However, it adapts poorly to neologisms, slang, and word meanings that shift in fast-changing times. In this paper we therefore address the problem of multiple word representations by partially applying WordNet's word relationships to Latent Semantic Indexing by means of a genetic algorithm, taking the strengths and weaknesses of both LSI and WordNet into account to cluster documents more efficiently. With this approach we aim to improve precision and increase the efficiency of the overall clusters.
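As an illustration of the document-to-vector step mentioned in the abstract, the following is a minimal sketch of plain LSI using a truncated SVD. It is not the authors' method: the WordNet and genetic-algorithm components are omitted, and the toy term-document matrix, the helper names `lsi_document_vectors` and `cosine`, and the choice `k=2` are assumptions made only for this example.

```python
import numpy as np

def lsi_document_vectors(term_doc, k):
    """Project documents into a k-dimensional latent space via truncated SVD."""
    # term_doc has one row per term and one column per document.
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    # Keep the k largest singular values; each row of the result is one document.
    return (np.diag(s[:k]) @ Vt[:k, :]).T

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical toy counts (rows: terms, columns: documents).
terms = ["car", "automobile", "engine", "flower", "petal"]
term_doc = np.array([
    [2, 0, 1, 0],  # car
    [0, 2, 1, 0],  # automobile
    [1, 1, 2, 0],  # engine
    [0, 0, 0, 2],  # flower
    [0, 0, 0, 1],  # petal
], dtype=float)

# Similarity of documents 0 and 1 on raw term counts.
raw_sim_01 = cosine(term_doc[:, 0], term_doc[:, 1])

# Similarity of the same documents in the latent space.
doc_vecs = lsi_document_vectors(term_doc, k=2)
sim_01 = cosine(doc_vecs[0], doc_vecs[1])
sim_03 = cosine(doc_vecs[0], doc_vecs[3])

print(f"raw 0-1: {raw_sim_01:.2f}, latent 0-1: {sim_01:.2f}, latent 0-3: {sim_03:.2f}")
```

In this toy data, documents 0 and 1 use different words ("car" versus "automobile") but share "engine", so their raw similarity is low while their latent-space similarity is high. Document 3, which shares no terms with them, stays dissimilar.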
Cite
Kim, J. J., Lee, Y. S., Moon, J. Y., & Park, J. M. (2018). Document classification method based on latent semantic indexing. International Journal of Grid and Distributed Computing, 11(4), 97–112. https://doi.org/10.14257/ijgdc.2018.11.4.09