Clustering high dimensional data using SVM

Tsau Young Lin; Tam Ngo

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2007) 4482 LNAI 256-262

DOI: 10.1007/978-3-540-72530-5_30

4 Citations

15 Readers

Abstract

The Web contains such a massive number of documents that classifying them manually has become impossible. This project's goal is to find a new method for clustering documents that comes as close to human classification as possible while also reducing the size of the document data. The project combines Latent Semantic Indexing (LSI), computed via Singular Value Decomposition (SVD), with Support Vector Machine (SVM) classification. Using SVD, the data is decomposed and truncated to reduce its size. The reduced data is then clustered into categories. The clustered data from the SVD step is used to train an SVM, so that new data can be classified from the SVM's predictions. The project's results show that combining SVD and SVM reduces data size and classifies documents reasonably well compared with human classification. © Springer-Verlag Berlin Heidelberg 2007.
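The pipeline described in the abstract (truncate with SVD, cluster the reduced data, train an SVM on the cluster labels, classify new documents) might be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the clustering algorithm, the SVM kernel, or any parameters, so the k-means step, the linear hinge-loss SVM trained by sub-gradient descent, the toy term-document matrix, and all function names here are assumptions made for the example.

```python
import numpy as np

def truncated_svd(A, k):
    """Return the top-k left singular vectors of A and the k-dim document coordinates.

    A is a term-document matrix (rows: terms, columns: documents).
    """
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    U_k = U[:, :k]
    docs_k = (U_k.T @ A).T  # one k-dimensional row per document
    return U_k, docs_k

def kmeans(X, k, iters=50):
    """Plain k-means with deterministic farthest-point initialization (assumed choice)."""
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(iters):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = np.argmin(dist, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=500):
    """Linear SVM (hinge loss + L2 penalty) via sub-gradient descent; y must be in {-1, +1}."""
    w = np.zeros(X.shape[1])
    b = 0.0
    n = len(y)
    for _ in range(epochs):
        viol = y * (X @ w + b) < 1  # points inside or beyond the margin
        grad_w = lam * w - (y[viol, None] * X[viol]).sum(axis=0) / n
        grad_b = -y[viol].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def predict(w, b, x):
    return 1 if x @ w + b >= 0 else -1

# Toy term-document matrix (made up): terms 0-2 are one topic, terms 3-5 another.
A = np.array([
    [2, 1, 1, 0, 0, 0],
    [1, 2, 0, 0, 0, 0],
    [1, 0, 2, 0, 0, 0],
    [0, 0, 0, 3, 1, 1],
    [0, 0, 0, 1, 2, 0],
    [0, 0, 0, 1, 0, 2],
], dtype=float)

U_k, docs_k = truncated_svd(A, k=2)        # 6 term dimensions reduced to 2
labels = kmeans(docs_k, k=2)               # cluster the reduced documents
y = np.where(labels == labels[0], 1, -1)   # +1 marks the cluster containing document 0
w, b = train_linear_svm(docs_k, y)         # train the SVM on the cluster labels

# A new document is folded into the same latent space before classification.
new_doc_a = U_k.T @ np.array([1, 1, 0, 0, 0, 0], dtype=float)
new_doc_b = U_k.T @ np.array([0, 0, 0, 1, 1, 0], dtype=float)
pred_a = predict(w, b, new_doc_a)
pred_b = predict(w, b, new_doc_b)
print(labels, pred_a, pred_b)
```

In this sketch the SVM never sees human-assigned categories; it learns only from the cluster assignments produced on the SVD-reduced data, which mirrors the ordering of steps given in the abstract. Projecting a new document with `U_k.T` keeps it in the same coordinate system as the training documents.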

Cite (APA)

Lin, T. Y., & Ngo, T. (2007). Clustering high dimensional data using SVM. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4482 LNAI, pp. 256–262). Springer Verlag. https://doi.org/10.1007/978-3-540-72530-5_30
