Text documents are usually represented as vectors in n-dimensional Euclidean space. One of the main problems in building a typology of texts by cluster analysis is determining the number of clusters. This article studies an agglomerative clustering algorithm in Euclidean space. A statistical criterion for completing the clustering process is derived as a Markov moment (a stopping time). The problem of cluster stability is also considered. As an example, the retrieval of harmful content is examined.
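To make the idea of agglomerative clustering with a completion criterion concrete, here is a minimal sketch in Python. It is not the article's method: the abstract does not give the Markov-moment criterion, so the sketch substitutes a simple heuristic that stops when the next merge distance exceeds `jump_factor` times the mean of the earlier merge distances. The function names, the use of centroid distance, and the `jump_factor` parameter are all assumptions made for illustration.

```python
import math


def centroid(points):
    """Coordinate-wise mean of a list of equal-length vectors."""
    dim = len(points[0])
    return [sum(p[i] for p in points) / len(points) for i in range(dim)]


def agglomerate(vectors, jump_factor=3.0):
    """Repeatedly merge the two clusters with the closest centroids.

    Stops when the next merge distance is more than `jump_factor` times
    the mean of the previous merge distances. This stop rule is an
    illustrative stand-in, NOT the Markov-moment criterion of the article.
    """
    clusters = [[list(v)] for v in vectors]  # start with singletons
    merge_distances = []
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest centroid distance.
        best = None
        for i in range(len(clusters)):
            ci = centroid(clusters[i])
            for j in range(i + 1, len(clusters)):
                d = math.dist(ci, centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        # Hypothetical stop rule: a sharp jump in the merge distance.
        if merge_distances:
            mean_prev = sum(merge_distances) / len(merge_distances)
            if mean_prev > 0 and d > jump_factor * mean_prev:
                break
        merge_distances.append(d)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters, merge_distances


# Toy stand-in for document vectors: two well-separated groups in the plane.
vectors = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
clusters, merge_distances = agglomerate(vectors)
print(len(clusters), [len(c) for c in clusters])
```

On this toy input the merge distances stay close to 1 until only the two groups remain, at which point the next merge distance is much larger and the loop ends. Real document vectors (for example, term-frequency vectors) would be high-dimensional, and the choice of stop rule would matter far more than it does here.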
CITATION STYLE
Orekhov, A. V. (2019). Agglomerative method for texts clustering. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11551 LNCS, pp. 19–32). Springer Verlag. https://doi.org/10.1007/978-3-030-17705-8_2