Evaluation of path based methods for conceptual representation of the text

Abstract

Typical text clustering methods use the bag-of-words (BoW) representation to describe the content of documents. However, this representation is known to have several limitations. Employing Wikipedia as a lexical knowledge base has been shown to improve text representation for data-mining purposes. Promising extensions of that trend employ the hierarchical organization of the Wikipedia category system. In this paper we propose three path-based measures for calculating document relatedness in such a conceptual space and compare them with the widely used Path Length approach. We evaluate them using the OPTICS clustering algorithm for categorization of keyword-based search results. The results show that our method outperforms the Path Length approach. © 2014 Springer International Publishing.
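
To make the Path Length baseline mentioned in the abstract concrete, the following is a minimal sketch of one common formulation: similarity between two categories is taken as 1 / (1 + d), where d is the number of edges on the shortest path between them in the category graph. The toy hierarchy, the function names, and the best-match averaging used to lift category similarity to document relatedness are all assumptions made for illustration. They are not taken from the paper, and the three measures the authors propose are not reproduced here.

```python
from collections import deque

# Hypothetical toy category hierarchy (child -> list of parents).
# Illustrative only; not the Wikipedia category graph and not from the paper.
PARENTS = {
    "Science": [],
    "Physics": ["Science"],
    "Biology": ["Science"],
    "Optics": ["Physics"],
    "Genetics": ["Biology"],
}


def build_undirected(parents):
    """Turn the child -> parents map into an undirected adjacency map."""
    graph = {node: set() for node in parents}
    for child, node_parents in parents.items():
        for parent in node_parents:
            graph[child].add(parent)
            graph[parent].add(child)
    return graph


def path_length(graph, source, target):
    """Number of edges on the shortest path (BFS), or None if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in graph[node]:
            if neighbour == target:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


def path_similarity(graph, a, b):
    """Assumed Path Length similarity: 1 / (1 + shortest path length)."""
    dist = path_length(graph, a, b)
    return 0.0 if dist is None else 1.0 / (1.0 + dist)


def document_relatedness(graph, cats_a, cats_b):
    """Assumed aggregation: symmetric average of best-match similarities."""
    if not cats_a or not cats_b:
        return 0.0

    def directed(xs, ys):
        return sum(max(path_similarity(graph, x, y) for y in ys) for x in xs) / len(xs)

    return 0.5 * (directed(cats_a, cats_b) + directed(cats_b, cats_a))


GRAPH = build_undirected(PARENTS)
```

A pairwise relatedness matrix computed this way could be converted to distances (for example 1 minus relatedness) and passed to a density-based clusterer such as OPTICS, which is the evaluation setting the abstract describes. The conversion itself is likewise an assumption here.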

Cite

Kucharczyk, Ł., & Szymański, J. (2014). Evaluation of path based methods for conceptual representation of the text. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8502 LNAI, pp. 435–444). Springer Verlag. https://doi.org/10.1007/978-3-319-08326-1_44