Comparing document classification schemes using K-means clustering

Artur Šilić; Marie Francine Moens; Lovro Žmak; Bojana Dalbelo Bašić

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2008) 5177 LNAI(PART 1) 615-624

DOI: 10.1007/978-3-540-85563-7_78

7 Citations

14 Readers

Abstract

In this work, we jointly apply several text mining methods to a corpus of legal documents in order to compare the separation quality of two inherently different document classification schemes. The classification schemes are compared with the clusters produced by the K-means algorithm. We believe that, in the future, our comparison method will be coupled with semi-supervised and active learning techniques. This paper also presents the idea of combining K-means and Principal Component Analysis for cluster visualization. The described idea allows calculations to be performed in a reasonable amount of CPU time. © 2008 Springer-Verlag Berlin Heidelberg.
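The abstract does not spell out how the clusters are compared with the classification schemes or how K-means and PCA are combined, so the following is only a generic sketch of the general idea, not the authors' method. It assumes synthetic vectors in place of the legal corpus, uses cluster purity as a stand-in comparison measure, and projects documents and centroids onto two principal components for plotting. The names `scheme_a`, `scheme_b`, `purity`, and `pca_2d` are hypothetical.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's K-means; returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every centroid.
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each centroid; keep the old one if a cluster is empty.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

def purity(cluster_labels, class_labels):
    """Fraction of items that belong to the majority class of their cluster."""
    total = 0
    for c in np.unique(cluster_labels):
        members = class_labels[cluster_labels == c]
        total += np.bincount(members).max()
    return total / len(class_labels)

def pca_2d(X, mean=None, components=None):
    """Project rows of X onto two principal components.

    If mean/components are given, reuse them (e.g. to project centroids
    into the same plane as the documents).
    """
    if mean is None:
        mean = X.mean(axis=0)
        _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
        components = vt[:2]
    return (X - mean) @ components.T, mean, components

# Synthetic stand-in for document vectors: three well-separated groups (assumption).
rng = np.random.default_rng(42)
centers = 10.0 * np.eye(3, 5)
X = np.vstack([c + rng.normal(size=(30, 5)) for c in centers])

# Two hypothetical classification schemes over the same 90 "documents":
scheme_a = np.repeat(np.arange(3), 30)  # follows the group structure
scheme_b = np.arange(90) % 3            # unrelated to the group structure

labels, centroids = kmeans(X, k=3)
purity_a = purity(labels, scheme_a)
purity_b = purity(labels, scheme_b)
print("purity against scheme A:", purity_a)
print("purity against scheme B:", purity_b)

# 2-D coordinates for a scatter plot of documents and cluster centroids.
coords, mean, components = pca_2d(X)
centroid_coords, _, _ = pca_2d(centroids, mean, components)
```

A scheme whose classes follow the structure that K-means recovers should score a higher purity than one that cuts across it. Other agreement measures could be substituted for purity without changing the overall shape of the comparison.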

Cite

Šilić, A., Moens, M. F., Žmak, L., & Bašić, B. D. (2008). Comparing document classification schemes using K-means clustering. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5177 LNAI, pp. 615–624). Springer Verlag. https://doi.org/10.1007/978-3-540-85563-7_78