Error correcting codes with optimized Kullback-Leibler distances for text categorization

Jörg Kindermann; Gerhard Paass; Edda Leopold

Journal Article, Open Access

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 2168, pp. 266-276 (2001)

DOI: 10.1007/3-540-44794-6_22

Abstract

We extend the multi-class categorization scheme that Dietterich and Bakiri (1995) proposed for binary classifiers, which is based on error correcting codes. The extension comprises computing the codes with a simulated annealing algorithm and optimizing the Kullback-Leibler (KL) distances between categories within the code-words. For the first time, we apply the scheme to text categorization with support vector machines (SVMs) on several large text corpora with more than 100 categories. The results are compared to 1-of-N coding (i.e., one SVM for each text category). We also investigate codes with optimized KL distance between the text categories that are merged in the code-words. We find that, as code length increases, error correcting codes perform better than 1-of-N coding. For very long codes, the performance is in some cases further improved by KL-distance optimization. © Springer-Verlag Berlin Heidelberg 2001.
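
To illustrate why error correcting codes can tolerate mistakes by individual binary classifiers, the following minimal Python sketch decodes a vector of binary outputs to the category whose code word is nearest in Hamming distance. The category names, the 7-bit code matrix, and the helper functions are hypothetical and chosen only for illustration. The sketch does not reproduce the paper's simulated annealing search, its KL-distance optimization, or its SVM training.

```python
from itertools import combinations

# Hypothetical code matrix: one 7-bit code word per category.
# Each column defines one binary task (1 = positive class for that classifier).
ECOC = {
    "sports":   (1, 1, 1, 1, 1, 1, 1),
    "politics": (0, 0, 0, 0, 1, 1, 1),
    "economy":  (0, 0, 1, 1, 0, 0, 1),
    "science":  (0, 1, 0, 1, 0, 1, 0),
}

# 1-of-N coding for comparison: one binary classifier per category.
ONE_OF_N = {
    "sports":   (1, 0, 0, 0),
    "politics": (0, 1, 0, 0),
    "economy":  (0, 0, 1, 0),
    "science":  (0, 0, 0, 1),
}


def hamming(a, b):
    """Number of positions in which two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def min_distance(code):
    """Smallest pairwise Hamming distance between the code words."""
    return min(hamming(a, b) for a, b in combinations(code.values(), 2))


def decode(bits, code):
    """Return the category whose code word is closest to the observed bits."""
    return min(code, key=lambda cat: hamming(bits, code[cat]))


# Simulate one wrong binary classifier output for a "politics" document.
noisy = list(ECOC["politics"])
noisy[0] ^= 1

print(decode(tuple(noisy), ECOC))                  # politics
print(min_distance(ECOC), min_distance(ONE_OF_N))  # 4 2
```

With a minimum distance of 4, this example code still decodes correctly after any single flipped bit, whereas 1-of-N code words differ in only 2 positions, so a single wrong classifier output can already make the nearest code word ambiguous.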

Citation (APA)

Kindermann, J., Paass, G., & Leopold, E. (2001). Error correcting codes with optimized Kullback-Leibler distances for text categorization. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2168, 266–276. https://doi.org/10.1007/3-540-44794-6_22
