Subset labeled LDA: A topic model for extreme multi-label classification

Abstract

Labeled Latent Dirichlet Allocation (LLDA) is an extension of the standard unsupervised Latent Dirichlet Allocation (LDA) algorithm that addresses multi-label learning tasks. Previous work has shown it to perform on par with other state-of-the-art multi-label methods. Nonetheless, as the number of labels increases, LLDA encounters scalability issues. In this work, we introduce Subset LLDA, a topic model that extends the standard LLDA algorithm and that not only scales efficiently to problems with hundreds of thousands of labels but also improves over the LLDA state of the art in terms of prediction accuracy. We conduct experiments on eight data sets, with the number of labels ranging from hundreds to hundreds of thousands, comparing our proposed algorithm with the other LLDA algorithms (Prior–LDA, Dep–LDA), as well as with the state of the art in extreme multi-label classification. The results show a steady advantage of our method over the other LLDA algorithms and competitive results compared to the extreme multi-label classification algorithms.

Cite

Papanikolaou, Y., & Tsoumakas, G. (2018). Subset labeled LDA: A topic model for extreme multi-label classification. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11031 LNCS, pp. 152–162). Springer Verlag. https://doi.org/10.1007/978-3-319-98539-8_12
