Computational topology in text mining

Hubert Wagner; Paweł Dłotko; Marian Mrozek

Conference Proceedings, Open Access

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2012) 7309 LNCS 68-78

DOI: 10.1007/978-3-642-30238-1_8

Abstract

In this paper we present our ongoing research on applying computational topology to the analysis of the structure of similarities within a collection of text documents. Our work lies on the boundary between text mining and computational topology, and we describe techniques from each of these disciplines. We transform text documents into the so-called vector space model, which is often used in text mining. This representation is suitable for topological computations. We compute homology, using discrete Morse theory, and persistent homology of the flag complex built from the point cloud representing the input data. Since the space is high-dimensional, many difficulties appear. We describe how we tackle these problems and point out which challenges remain to be solved. © 2012 Springer-Verlag.
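The pipeline named in the abstract (documents to vector space model, point cloud to flag complex) can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' method: the TF-IDF weighting, the cosine dissimilarity, the threshold `eps`, the toy corpus, and the helper names `tfidf_vectors`, `cosine_dissimilarity`, `flag_complex`, and `betti_0` are all hypothetical. The sketch only counts connected components (the 0-th Betti number) instead of computing full homology with discrete Morse theory.

```python
import math
from collections import Counter
from itertools import combinations


def tfidf_vectors(docs):
    """Map each document to a sparse term-weight dict (a vector space model).

    Assumption: whitespace tokenization and smoothed TF-IDF weights.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        )
    return vectors


def cosine_dissimilarity(u, v):
    """Return 1 - cosine similarity of two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def flag_complex(vectors, eps, max_dim=2):
    """Build the flag (clique) complex of the eps-neighborhood graph.

    A set of points spans a simplex exactly when all its pairs are edges.
    Brute-force enumeration; only practical for tiny inputs.
    """
    n = len(vectors)
    edges = {
        (i, j)
        for i, j in combinations(range(n), 2)
        if cosine_dissimilarity(vectors[i], vectors[j]) <= eps
    }
    simplices = [(i,) for i in range(n)]
    for k in range(2, max_dim + 2):  # k vertices span a (k-1)-simplex
        for cand in combinations(range(n), k):
            if all(pair in edges for pair in combinations(cand, 2)):
                simplices.append(cand)
    return simplices


def betti_0(n_points, simplices):
    """Count connected components with union-find over the edges."""
    parent = list(range(n_points))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for s in simplices:
        if len(s) == 2:
            parent[find(s[0])] = find(s[1])
    return len({find(i) for i in range(n_points)})


# Hypothetical toy corpus: two documents on topology, two on finance.
docs = [
    "persistent homology of point clouds",
    "homology of simplicial complexes",
    "stock markets fell sharply today",
    "markets and stock prices fell",
]
vectors = tfidf_vectors(docs)
for eps in (0.1, 0.7):
    complex_at_eps = flag_complex(vectors, eps)
    print(eps, len(complex_at_eps), betti_0(len(docs), complex_at_eps))
```

Sweeping `eps` from small to large values yields a nested family of complexes, which is the kind of filtration persistent homology works on. In this toy corpus the two topical pairs merge into separate components at `eps = 0.7`, while at `eps = 0.1` every document is isolated. The brute-force clique enumeration also hints at why high-dimensional data is difficult: the number of candidate simplices grows combinatorially with the number of points and the dimension.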

Citation (APA)

Wagner, H., Dłotko, P., & Mrozek, M. (2012). Computational topology in text mining. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7309 LNCS, pp. 68–78). https://doi.org/10.1007/978-3-642-30238-1_8
