Relevance of contextual information in compression-based text clustering

Abstract

In this paper we take a step towards understanding compression distances by analyzing the relevance of contextual information in compression-based text clustering. To do so, we explore two kinds of word removal: one that preserves part of the contextual information and one that does not. We show that removing words while preserving contextual information improves the accuracy of compression-based text clustering, whereas removing words without preserving it makes the clustering results worse. © 2010 Springer-Verlag Berlin Heidelberg.
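
The contrast drawn in the abstract can be illustrated with a small sketch. The abstract does not say which compression distance, compressor, or removal scheme the authors used, so everything here is an assumption: the distance is the widely used normalized compression distance, NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), the compressor is zlib, and "maintaining context" is modeled by overwriting each removed word with filler characters of the same length. The function names are hypothetical.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (compressor is an assumption)."""
    return len(zlib.compress(data, 9))


def ncd(x: str, y: str) -> float:
    """Normalized compression distance between two texts."""
    bx, by = x.encode("utf-8"), y.encode("utf-8")
    cx, cy = compressed_size(bx), compressed_size(by)
    cxy = compressed_size(bx + by)
    return (cxy - min(cx, cy)) / max(cx, cy)


def remove_words_keep_context(text: str, words: set, filler: str = "*") -> str:
    """Assumed context-preserving removal: each removed word is replaced by
    filler characters of the same length, so its position and length survive."""
    return " ".join(
        filler * len(tok) if tok.lower() in words else tok
        for tok in text.split()
    )


def remove_words_drop_context(text: str, words: set) -> str:
    """Assumed context-losing removal: removed words are deleted outright."""
    return " ".join(tok for tok in text.split() if tok.lower() not in words)


if __name__ == "__main__":
    sentence = "the cat sat on the mat"
    stop = {"the", "on"}
    print(remove_words_keep_context(sentence, stop))
    print(remove_words_drop_context(sentence, stop))
    print(ncd(sentence * 20, sentence * 20))
```

A clustering experiment along these lines would compute the pairwise NCD matrix once per removal variant and feed each matrix to the same clustering algorithm, so that any difference in accuracy can be attributed to the removal scheme alone.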

Cite

Granados, A., Martínez, R., Camacho, D., & De Borja Rodríguez, F. (2010). Relevance of contextual information in compression-based text clustering. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6283 LNCS, pp. 259–266). https://doi.org/10.1007/978-3-642-15381-5_32
