A statistics–based semantic relation analysis approach for document clustering


Abstract

Document clustering is a widely researched topic in machine learning. A number of approaches have been proposed to represent and cluster documents. One recent trend in document clustering research is to incorporate semantic information into the document representation. In this paper, we introduce a novel technique for capturing robust and reliable semantic information from term-term co-occurrence statistics. First, we propose a novel method to evaluate the explicit semantic relation between terms from their co-occurrence information. Then, the underlying semantic relation between terms is captured through their interactions with other terms. Finally, these two complementary semantic relations are integrated to capture the complete semantic information in the original documents. Experimental results show that clustering performance improves significantly when the document representation is enriched with this semantic information.
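
The abstract does not give the paper's formulas, so the following Python sketch only illustrates the general shape of such a pipeline under assumptions of our own. The explicit relation is approximated here by a Jaccard-style score over document-level co-occurrence, the underlying relation by the cosine similarity of two terms' explicit relations to all other terms, and the two are blended with a hypothetical weight `alpha`. The function names (`explicit_relation`, `implicit_relation`, `combined_relation`, `enrich`) and the toy corpus are illustrative and do not come from the paper.

```python
from itertools import combinations
from math import sqrt


def explicit_relation(docs):
    """Jaccard-style score from document-level co-occurrence (assumed measure)."""
    vocab = sorted({t for d in docs for t in d})
    df = {t: 0 for t in vocab}   # number of documents containing each term
    co = {}                      # number of documents containing both terms
    for d in docs:
        terms = sorted(set(d))
        for t in terms:
            df[t] += 1
        for a, b in combinations(terms, 2):
            co[(a, b)] = co.get((a, b), 0) + 1
    rel = {a: {b: 0.0 for b in vocab} for a in vocab}
    for a in vocab:
        rel[a][a] = 1.0
    for (a, b), n in co.items():
        score = n / (df[a] + df[b] - n)
        rel[a][b] = rel[b][a] = score
    return vocab, rel


def implicit_relation(vocab, rel):
    """Cosine similarity of two terms' explicit relations to all *other* terms
    (an assumed stand-in for the 'underlying' relation)."""
    imp = {a: {b: 0.0 for b in vocab} for a in vocab}
    for a in vocab:
        imp[a][a] = 1.0
    for a, b in combinations(vocab, 2):
        others = [t for t in vocab if t != a and t != b]
        dot = sum(rel[a][t] * rel[b][t] for t in others)
        norm_a = sqrt(sum(rel[a][t] ** 2 for t in others))
        norm_b = sqrt(sum(rel[b][t] ** 2 for t in others))
        score = dot / (norm_a * norm_b) if norm_a and norm_b else 0.0
        imp[a][b] = imp[b][a] = score
    return imp


def combined_relation(vocab, rel, imp, alpha=0.5):
    """Linear blend of the two relations; alpha is a hypothetical parameter."""
    return {a: {b: alpha * rel[a][b] + (1 - alpha) * imp[a][b] for b in vocab}
            for a in vocab}


def enrich(doc, vocab, sim):
    """Spread each term's count onto related terms: v' = v * S."""
    counts = {t: doc.count(t) for t in vocab}
    return [sum(counts[s] * sim[s][t] for s in vocab) for t in vocab]


# Toy corpus: "car" and "automobile" never co-occur but share neighbours.
docs = [["car", "engine"], ["automobile", "engine"],
        ["car", "wheel"], ["automobile", "wheel"], ["cat", "dog"]]
vocab, rel = explicit_relation(docs)
imp = implicit_relation(vocab, rel)
sim = combined_relation(vocab, rel, imp)
enriched = enrich(docs[0], vocab, sim)
```

In this toy corpus "car" and "automobile" have no explicit relation because they never appear in the same document, yet they obtain a high underlying relation because both co-occur with "engine" and "wheel". After enrichment, the first document therefore receives a nonzero weight for "automobile", which is the kind of effect the abstract attributes to combining the two relations. The enriched vectors could then be passed to any standard clustering algorithm.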


Cheng, X., Miao, D. Q., & Wang, L. (2014). A statistics–based semantic relation analysis approach for document clustering. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8818, pp. 332–342). Springer Verlag. https://doi.org/10.1007/978-3-319-11740-9_31
