Analyzing the temporal sequences for text categorization

Xiao Luo; A. Nur Zincir-Heywood

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2004) 3215 498-505

DOI: 10.1007/978-3-540-30134-9_67

5 Citations

11 Readers

Abstract

This paper describes a text categorization approach that combines a newly designed text representation with a kNN classifier. The new document representation is based on an unsupervised learning mechanism: a hierarchical structure of Self-Organizing Feature Maps. Through this architecture, a document is encoded as a sequence of neurons and the corresponding distances to those neurons, preserving the temporal sequences of words as well as their frequencies. Combined with a kNN classifier, this representation achieved good performance (micro-averaged F1-measure of 0.855) on the experimental data set. This suggests that the architecture can capture the characteristic temporal sequences of documents and categories, which can be used for various text categorization and clustering tasks.
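As a rough illustration of the encoding idea in the abstract, the sketch below maps each word vector, in reading order, to its closest neuron and records the distance to it. It is not the authors' architecture: it uses a single flat map rather than a hierarchy, skips training entirely, and the codebook values, the 2-D word vectors, and the function names `best_matching_unit` and `encode_document` are all hypothetical.

```python
import math

# Hypothetical codebook: one 2-D weight vector per neuron (made-up values,
# standing in for a trained Self-Organizing Feature Map).
CODEBOOK = [
    (0.0, 0.0),
    (1.0, 0.0),
    (0.0, 1.0),
    (1.0, 1.0),
]

def best_matching_unit(vector, codebook):
    """Return (neuron index, Euclidean distance) for the closest neuron."""
    distances = [math.dist(vector, weights) for weights in codebook]
    index = min(range(len(codebook)), key=distances.__getitem__)
    return index, distances[index]

def encode_document(word_vectors, codebook):
    """Encode word vectors, kept in document order, as (neuron, distance) pairs."""
    return [best_matching_unit(v, codebook) for v in word_vectors]

# Toy document: three word vectors in reading order (hypothetical values);
# the first and third stand for the same word occurring twice.
document = [(0.1, 0.0), (0.9, 1.0), (0.1, 0.0)]
encoded = encode_document(document, CODEBOOK)
print(encoded)
```

Because the output keeps one entry per word in its original position, a repeated word shows up as a repeated neuron index, so both word order and word frequency remain visible in the encoded sequence. A kNN classifier would then need some similarity measure over such sequences, which this sketch leaves out.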

Citation (APA)

Luo, X., & Zincir-Heywood, A. N. (2004). Analyzing the temporal sequences for text categorization. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 3215, pp. 498–505). Springer Verlag. https://doi.org/10.1007/978-3-540-30134-9_67
