A new dimensionality reduction technique based on HMM for boosting document classification

A. Seara Vieira; E. L. Iglesias; L. Borrajo

Conference Proceedings

Advances in Intelligent Systems and Computing (2015) 375 69-77

DOI: 10.1007/978-3-319-19776-0_8

Citations: 1

Readers: 5

Abstract

Many classification problems, such as text classification, require handling high-dimensional structured representations of documents. The sheer size of the data makes computation burdensome, so there is a strong need to reduce the amount of information handled during classification. In this paper, we propose a dimensionality reduction technique for text datasets based on a clustering method that groups documents, with a simple Hidden Markov Model to represent them. We applied the new method to the OHSUMED benchmark text corpus using k-NN and SVM classifiers. The results are very satisfactory and demonstrate the suitability of the proposed technique for dimensionality reduction and document classification.
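
The abstract does not give the details of the HMM-based representation, so the following is only a minimal sketch of the general idea of cluster-based dimensionality reduction followed by k-NN classification. It is not the authors' method: it swaps in plain k-means centroids for the HMM and describes each document by its distance to every centroid. The toy term-count vectors, the labels `class_a` and `class_b`, and all function names are hypothetical.

```python
import math
from collections import Counter


def kmeans(vectors, k, iters=20):
    """Plain k-means; returns k centroids.

    Initialization takes evenly spaced input vectors (an assumption made
    here only to keep the sketch deterministic).
    """
    centroids = [list(vectors[i * len(vectors) // k]) for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k), key=lambda c: math.dist(v, centroids[c]))
            groups[nearest].append(v)
        for c, members in enumerate(groups):
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def reduce_dims(vector, centroids):
    """Map a document to k features: its distance to each cluster centroid."""
    return [math.dist(vector, c) for c in centroids]


def knn_predict(train_x, train_y, query, n_neighbors=1):
    """Majority vote among the n_neighbors closest training points."""
    ranked = sorted(zip(train_x, train_y),
                    key=lambda pair: math.dist(pair[0], query))
    votes = [label for _, label in ranked[:n_neighbors]]
    return Counter(votes).most_common(1)[0][0]


# Hypothetical toy corpus: term counts over a 6-word vocabulary.
docs = [
    [3, 2, 1, 0, 0, 0],
    [2, 3, 0, 0, 1, 0],
    [0, 0, 1, 3, 2, 2],
    [0, 1, 0, 2, 3, 3],
]
labels = ["class_a", "class_a", "class_b", "class_b"]

centroids = kmeans(docs, k=2)
reduced = [reduce_dims(d, centroids) for d in docs]  # 6 dims -> 2 dims

new_doc = [2, 2, 1, 0, 0, 0]
prediction = knn_predict(reduced, labels, reduce_dims(new_doc, centroids))
print(prediction)
```

Here the classifier works on two features per document instead of six. On a real corpus the reduction would be from thousands of term dimensions down to the number of clusters.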

Cite (APA)

Vieira, A. S., Iglesias, E. L., & Borrajo, L. (2015). A new dimensionality reduction technique based on HMM for boosting document classification. In Advances in Intelligent Systems and Computing (Vol. 375, pp. 69–77). Springer Verlag. https://doi.org/10.1007/978-3-319-19776-0_8
