The rapid growth of volatile online text data creates a demand for automatic text analysis tools that identify worthwhile information. Documents, as well as text streams, can be structured beyond the concept of frequency distributions. Here we introduce a novel method that provides a relative measure of information value over a time series that is mapped by a dynamic trie structure. We adapt the concept of entropy to textual data and employ a compression-based estimation method. The algorithm can run in a real-time scenario because it has linear complexity and is based on a dynamic history of predefined size. We show the suitability of our method on an experimental dataset and compare our results to an existing approach. Our results reveal structural properties of the texts and permit deeper analysis of the presumed information peaks. © 2013 Springer-Verlag Berlin Heidelberg.
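The idea of a compression-based, relative information measure over a bounded history can be illustrated with a minimal sketch. This is not the authors' trie-based algorithm: it swaps in the general-purpose `zlib` compressor, and the names `information_profile` and `history_size` are hypothetical, chosen only for illustration. Each incoming chunk is scored by how many extra compressed bytes it costs given the recent history, so repeated material scores low and novel material scores high.

```python
import zlib
from collections import deque


def information_profile(chunks, history_size=50):
    """Score each text chunk by its compressed cost relative to recent history.

    Assumption: zlib stands in for the paper's dynamic trie estimator.
    Unlike the linear-time method described in the abstract, this sketch
    recompresses the whole history window for every chunk.
    """
    history = deque(maxlen=history_size)  # bounded history of predefined size
    scores = []
    for chunk in chunks:
        context = " ".join(history).encode("utf-8")
        data = chunk.encode("utf-8")
        # Compressed size of the history alone, and of history plus new chunk.
        base = len(zlib.compress(context, 9))
        joint = len(zlib.compress(context + b" " + data, 9))
        # Extra compressed bytes per input byte: a relative novelty estimate.
        scores.append((joint - base) / max(len(data), 1))
        history.append(chunk)
    return scores


if __name__ == "__main__":
    stream = ["the cat sat on the mat"] * 5
    stream.append("quantum flux capacitor overheated yesterday")
    for text, score in zip(stream, information_profile(stream)):
        print(f"{score:6.3f}  {text}")
```

In this sketch the scores are only meaningful relative to one another within the same stream, which matches the abstract's description of a relative measure; peaks in the resulting series would be the candidates for closer inspection.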
CITATION STYLE
Bohne, T., & Borghoff, U. M. (2013). Detecting information structures in texts. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8112 LNCS, pp. 467–474). Springer Verlag. https://doi.org/10.1007/978-3-642-53862-9_59