Validation of text clustering based on document contents

Jarmo Toivonen; Ari Visa; Tomi Vesanen; Barbro Back; Hannu Vanharanta

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2001) 2123 LNAI 184-195

DOI: 10.1007/3-540-44596-x_15

2 Citations

6 Readers

Abstract

This paper presents some results of a new text clustering methodology. A prototype is an interesting document or an extracted part of an interesting text. The given prototype is matched against the existing document database or the monitored document flow. Our claim is that the new methodology is capable of automatic content-based clustering using the information in the document. To verify this hypothesis, an experiment was designed with the Bible. Four different translations (one Greek, one Latin, and two Finnish translations from 1933/38 and 1992) were selected as test material. Validation experiments were performed with a prototype version of the software application. © Springer-Verlag Berlin Heidelberg 2001.

Citation (APA)

Toivonen, J., Visa, A., Vesanen, T., Back, B., & Vanharanta, H. (2001). Validation of text clustering based on document contents. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 2123 LNAI, pp. 184–195). Springer Verlag. https://doi.org/10.1007/3-540-44596-x_15
