Ontology-based Text Document Clustering.

  • Staab S
  • Hotho A

Abstract

Text clustering typically involves clustering in a high-dimensional space, which is difficult in virtually all practical settings. In addition, given a particular clustering result, it is typically very hard to come up with a good explanation of why the text clusters have been constructed the way they are. In this paper, we propose a new approach for applying background knowledge during preprocessing in order to improve clustering results and allow for selection between results. We preprocess our input data by applying ontology-based heuristics for feature selection and feature aggregation. Thus, we construct a number of alternative text representations. Based on these representations, we compute multiple clustering results using K-Means. The results may be distinguished and explained by the corresponding selection of concepts in the ontology. Our results compare favourably with a sophisticated baseline preprocessing strategy.
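The pipeline the abstract outlines (ontology-based feature aggregation, several alternative representations, one K-Means run per representation) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the toy `ONTOLOGY` mapping, the helper names, and the deterministic seeding of K-Means with the first `k` vectors. The paper's actual heuristics are not specified in this abstract.

```python
from collections import Counter

# Hypothetical toy ontology (not from the paper): term -> parent concept.
ONTOLOGY = {
    "apple": "fruit", "banana": "fruit", "cherry": "fruit",
    "car": "vehicle", "truck": "vehicle", "bus": "vehicle",
}


def term_vector(tokens, vocabulary):
    """Plain bag-of-words counts over a fixed vocabulary."""
    counts = Counter(tokens)
    return [float(counts[term]) for term in vocabulary]


def concept_vector(tokens, concepts):
    """Feature aggregation: add up term counts under their parent concept."""
    counts = Counter(ONTOLOGY[t] for t in tokens if t in ONTOLOGY)
    return [float(counts[concept]) for concept in concepts]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20):
    """Lloyd-style K-Means, seeded with the first k vectors (an assumption)."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every vector.
        labels = [
            min(range(k), key=lambda j: squared_distance(v, centroids[j]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [v for v, label in zip(vectors, labels) if label == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def cluster_representations(documents, k):
    """Build two alternative representations and cluster each one."""
    vocabulary = sorted({t for doc in documents for t in doc})
    concepts = sorted(set(ONTOLOGY.values()))
    return {
        "terms": kmeans([term_vector(d, vocabulary) for d in documents], k),
        "concepts": kmeans([concept_vector(d, concepts) for d in documents], k),
    }


if __name__ == "__main__":
    docs = [["apple", "banana"], ["car", "truck"],
            ["cherry", "apple"], ["bus", "car"]]
    for name, labels in cluster_representations(docs, k=2).items():
        print(name, labels)
```

In the concept-level representation each dimension corresponds to an ontology concept, so a cluster can be described by the concepts that dominate its centroid, which is the kind of explanation the abstract refers to.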

Cite

Staab, S., & Hotho, A. (2003). Ontology-based Text Document Clustering. In Intelligent Information Processing and Web Mining, Proceedings of the International IIS: IIPWM’03 Conference held in Zakopane (pp. 451–452). Retrieved from http://dblp.uni-trier.de/db/conf/iis/iis2003.html#StaabH03
