Error correcting codes with optimized kullback-leibler distances for text categorization

7Citations
Citations of this article
9Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

We extend a multi-class categorization scheme proposed by Dietterich and Bakiri 1995 for binary classifiers, using error correcting codes. The extension comprises the computation of the codes by a simulated annealing algorithm and optimization of Kullback-Leibler (KL) category distances within the code-words. For the first time, we apply the scheme to text categorization with support vector machines (SVMs) on several large text corpora with more than 100 categories. The results are compared to 1-of-N coding (i. e. one SVM for each text category). We also investigate codes with optimized KL distance between the text categories which are merged in the code-words. We find that error correcting codes perform better than 1-of-N coding with increasing code length. For very long codes, the performance is in some cases further improved by KL-distance optimization. © Springer-Verlag Berlin Heidelberg 2001.

Cite

CITATION STYLE

APA

Kindermann, J., Paass, G., & Leopold, E. (2001). Error correcting codes with optimized kullback-leibler distances for text categorization. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2168, 266–276. https://doi.org/10.1007/3-540-44794-6_22

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free