We propose a method based on restricted random walk clustering as a (semi-)automated complement to the tedious, error-prone, and expensive task of manual indexing in a scientific library. In the first stage, we cluster a set of (partially) indexed documents using restricted random walks on usage histories in order to find groups of similar documents. In the second stage, we derive candidate keywords for documents without indexing information from the frequencies of the keywords assigned to the other documents in their respective cluster. Owing to the specific clustering algorithm, the proposed method remains efficient with millions of documents and can be deployed on standard PC hardware. © Springer-Verlag Berlin Heidelberg 2004.
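The second stage described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the clusters have already been produced by some clustering step (the restricted random walk itself is not shown), and the names `suggest_keywords`, `keywords_by_doc`, and `top_n`, as well as the sample data, are hypothetical. Scoring each keyword by the share of indexed cluster members that carry it is one plausible reading of "frequencies of keywords", not a detail confirmed by the abstract.

```python
from collections import Counter


def suggest_keywords(cluster, keywords_by_doc, top_n=3):
    """Rank candidate keywords for the unindexed documents of one cluster.

    Each keyword is scored by the fraction of already indexed cluster
    members that carry it (an assumed scoring rule for illustration).
    """
    # Only documents that already have indexing information can vote.
    indexed = [doc for doc in cluster if keywords_by_doc.get(doc)]
    if not indexed:
        return []
    # Count each keyword at most once per document.
    counts = Counter(kw for doc in indexed for kw in set(keywords_by_doc[doc]))
    # Sort by descending count, then alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(kw, n / len(indexed)) for kw, n in ranked[:top_n]]


# Hypothetical cluster: d4 has no indexing information yet.
cluster = ["d1", "d2", "d3", "d4"]
keywords_by_doc = {
    "d1": ["clustering", "random walks"],
    "d2": ["clustering", "libraries"],
    "d3": ["clustering", "random walks"],
}
suggestions = suggest_keywords(cluster, keywords_by_doc)
```

In a semi-automated setting, a ranked list such as `suggestions` would be shown to a human indexer as candidates for `d4` rather than assigned automatically.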
CITATION STYLE
Franke, M., & Geyer-Schulz, A. (2004). Automated indexing with restricted random walks on large document sets. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 3232, 232–243. https://doi.org/10.1007/978-3-540-30230-8_22