As the amount of information available in electronic document collections grows, methods for organizing these collections to support topic-oriented browsing and orientation are becoming increasingly important. The SOMLib digital library system provides such an organization by using the Self-Organizing Map, a popular neural network model, to produce a map of the document space. However, hierarchical relations between documents remain hidden in this display. Moreover, as document archives grow, the required maps become larger, which makes it harder for the user to stay oriented within the map. In this case, a hierarchically structured representation of the document space would be highly preferable. In this paper, we present the Growing Hierarchical Self-Organizing Map, a dynamically growing neural network model that provides a content-based hierarchical decomposition and organization of document spaces. The architecture evolves into a hierarchical structure according to the requirements of the input data during an unsupervised training process. A recent enhancement of the training process further ensures that the various topical partitions are properly oriented with respect to one another, which facilitates intuitive navigation between neighboring topical branches. The benefits of this approach are shown by organizing a real-world document collection according to semantic similarities.
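The abstract does not state the rule by which the hierarchy grows. As a rough illustration only, the sketch below assumes one plausible criterion: a map unit is expanded into a child map when the mean quantization error of the documents mapped to it exceeds a fraction of the error of the whole collection. The function names and the threshold parameter `tau2` are hypothetical and are not taken from the paper.

```python
import math


def quantization_error(weight, vectors):
    """Mean Euclidean distance between a weight vector and the given inputs."""
    if not vectors:
        return 0.0
    return sum(math.dist(weight, v) for v in vectors) / len(vectors)


def best_matching_unit(weights, vector):
    """Index of the unit whose weight vector is closest to the input."""
    return min(range(len(weights)), key=lambda i: math.dist(weights[i], vector))


def units_to_expand(weights, data, tau2=0.5):
    """Indices of units that would get a child map under the assumed criterion.

    Assumption: a unit is expanded when its mean quantization error is
    larger than tau2 times the error of a single unit placed at the data mean.
    """
    dim = len(data[0])
    mean = [sum(v[d] for v in data) / len(data) for d in range(dim)]
    qe0 = quantization_error(mean, data)

    # Assign every input vector to its best-matching unit.
    mapped = {i: [] for i in range(len(weights))}
    for v in data:
        mapped[best_matching_unit(weights, v)].append(v)

    return [
        i
        for i, w in enumerate(weights)
        if quantization_error(w, mapped[i]) > tau2 * qe0
    ]
```

A smaller `tau2` in this sketch yields deeper hierarchies, because more units exceed the threshold and are refined on the next layer.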
CITATION STYLE
Dittenbach, M., Rauber, A., & Merkl, D. (2001). Business, culture, politics, and sports – How to find your way through a bulk of news? On content-based hierarchical structuring and organization of large document archives. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 2113, pp. 200–210). Springer Verlag. https://doi.org/10.1007/3-540-44759-8_21