Regularization for unsupervised classification on taxonomies

Abstract

We study unsupervised classification of text documents into a taxonomy of concepts, each annotated by only a few keywords. Our central claim is that the structure of the taxonomy encapsulates background knowledge that can be exploited to improve classification accuracy. Under our hierarchical Dirichlet generative model for the document corpus, we show that the unsupervised classification algorithm provides robust estimates of the classification parameters by performing regularization, and that the algorithm can be interpreted as a regularized EM algorithm. We also propose a technique for automatically choosing the regularization parameter. In addition, we propose a regularization scheme for K-means on hierarchies. We experimentally demonstrate that both of our regularized clustering algorithms achieve higher classification accuracy than simple models such as minimum distance, Naïve Bayes, EM and K-means. © Springer-Verlag Berlin Heidelberg 2006.
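
The abstract does not spell out the regularization scheme, so the following is only a rough sketch of one way a taxonomy could regularize K-means. It is not the authors' algorithm. It assumes that each node's centroid is seeded from its keyword vector and that, after each update, a child's centroid is shrunk toward its parent's cluster mean with a weight `lam`. The function `regularized_kmeans`, the parameter `lam`, the cosine-similarity assignment rule, and the toy vocabulary are all hypothetical choices made for illustration.

```python
# Illustrative sketch only: parent-shrinkage K-means on a tiny taxonomy.
# The update rule and every name below are assumptions, not the paper's method.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    nu, nv = dot(u, u) ** 0.5, dot(v, v) ** 0.5
    return dot(u, v) / (nu * nv) if nu and nv else 0.0

def mean(vectors, dim):
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def regularized_kmeans(docs, seeds, parent, lam=0.3, iters=10):
    """docs: list of term-count vectors.
    seeds: node -> keyword vector used as the initial centroid.
    parent: node -> parent node, or None for the root.
    lam: assumed shrinkage weight toward the parent (0 gives plain K-means)."""
    dim = len(docs[0])
    centroids = {n: [float(x) for x in v] for n, v in seeds.items()}
    labels = []
    for _ in range(iters):
        # Assignment step: each document goes to the most similar node.
        labels = [max(centroids, key=lambda n: cosine(d, centroids[n]))
                  for d in docs]
        # Update step: per-node cluster mean (empty clusters keep their centroid).
        means = {}
        for n in centroids:
            members = [d for d, lab in zip(docs, labels) if lab == n]
            means[n] = mean(members, dim) if members else centroids[n]
        # Regularization step: blend each child's mean with its parent's mean.
        updated = {}
        for n in centroids:
            p = parent.get(n)
            if p is None:
                updated[n] = means[n]
            else:
                updated[n] = [(1 - lam) * m + lam * c
                              for m, c in zip(means[n], means[p])]
        centroids = updated
    return labels, centroids

# Toy vocabulary (hypothetical): ["sport", "football", "tennis"].
seeds = {"sport": [1, 0, 0], "football": [0, 1, 0], "tennis": [0, 0, 1]}
parent = {"sport": None, "football": "sport", "tennis": "sport"}
docs = [[1, 3, 0], [0, 2, 0], [1, 0, 4], [0, 0, 2], [3, 0, 0]]

labels, centroids = regularized_kmeans(docs, seeds, parent, lam=0.3)
```

In this sketch, shrinking toward the parent keeps a sparsely populated child node from drifting far from its branch of the taxonomy, which is one plausible reading of how the hierarchy could act as background knowledge. How `lam` should be chosen is left open here; the abstract states that the authors propose an automatic choice but gives no details.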

Cite

Sona, D., Veeramachaneni, S., Polettini, N., & Avesani, P. (2006). Regularization for unsupervised classification on taxonomies. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4203 LNAI, pp. 691–696). Springer Verlag. https://doi.org/10.1007/11875604_76
