Centroid-Based Library Management and Document Clustering

Mario Kubek

Book Chapter

Springer Science and Business Media Deutschland GmbH, (2020), 103-116

DOI: 10.1007/978-3-030-23136-1_7

Abstract

Building on the fundamentals and findings of the preceding chapters, this chapter presents hierarchical and centroid-based document management algorithms inspired by the way human librarians classify, sort and catalogue incoming documents. The algorithms use the distance between centroid terms in a local co-occurrence graph as a measure of the documents' semantic closeness, generate (sub)clusters of documents on that basis, and assign these to processing nodes (child nodes created on the fly for this purpose). These centroid-based library management and clustering algorithms are designed to run in a decentralised manner on the peers (the librarians) of a P2P network. The same approach is also used to classify and answer incoming queries and to route and forward them to semantically matching child nodes.
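
The abstract does not spell out the algorithms, so the following is only a minimal sketch of the general idea of comparing documents by the graph distance between their centroid terms. It rests on several assumptions that are not taken from the chapter: the co-occurrence graph is built at sentence level, the edge distance is the reciprocal of the raw co-occurrence count, and a document's centroid term is the graph node with the smallest average shortest-path distance to the document's terms. All function names are hypothetical.

```python
import heapq
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_graph(sentences):
    """Build an undirected term graph from tokenised sentences.

    Assumption: edge distance = 1 / co-occurrence count, so terms that
    co-occur often are close together.
    """
    counts = defaultdict(int)
    for sentence in sentences:
        for a, b in combinations(sorted(set(sentence)), 2):
            counts[(a, b)] += 1
    graph = defaultdict(dict)
    for (a, b), n in counts.items():
        graph[a][b] = graph[b][a] = 1.0 / n
    return graph


def shortest_distances(graph, source):
    """Dijkstra: shortest-path distance from source to every reachable term."""
    dist = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist[node]:
            continue  # stale queue entry
        for neighbour, weight in graph[node].items():
            nd = d + weight
            if nd < dist.get(neighbour, float("inf")):
                dist[neighbour] = nd
                heapq.heappush(queue, (nd, neighbour))
    return dist


def centroid_term(graph, terms):
    """Return the graph term with minimal average distance to the given terms.

    Assumed definition; terms missing from the graph are ignored and ties
    are broken alphabetically. Returns None if no candidate reaches all terms.
    """
    known = [t for t in set(terms) if t in graph]
    if not known:
        return None
    best, best_avg = None, float("inf")
    for candidate in sorted(graph):
        dist = shortest_distances(graph, candidate)
        if any(t not in dist for t in known):
            continue  # candidate cannot reach every document term
        avg = sum(dist[t] for t in known) / len(known)
        if avg < best_avg:
            best, best_avg = candidate, avg
    return best


def centroid_distance(graph, doc_a, doc_b):
    """Distance between the centroid terms of two documents (inf if undefined)."""
    ca, cb = centroid_term(graph, doc_a), centroid_term(graph, doc_b)
    if ca is None or cb is None:
        return float("inf")
    return shortest_distances(graph, ca).get(cb, float("inf"))


# Toy corpus: each inner list is one tokenised sentence.
corpus = [["cat", "dog"], ["cat", "dog"], ["dog", "car"],
          ["car", "bus"], ["car", "bus"]]
graph = build_cooccurrence_graph(corpus)
closeness = centroid_distance(graph, ["cat", "dog", "car"], ["dog", "car", "bus"])
```

In a clustering or routing setting, a smaller `centroid_distance` would indicate that two documents (or a query and a document collection) are semantically closer; how the chapter turns this into cluster and child-node assignments is not specified in the abstract and is not modelled here.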

Citation (APA)

Kubek, M. (2020). Centroid-Based Library Management and Document Clustering. In Studies in Big Data (Vol. 62, pp. 103–116). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-23136-1_7
