Stochastic divergence minimization for biterm topic models

Zhenghang Cui; Issei Sato; Masashi Sugiyama

Journal Article, Open Access

IEICE Transactions on Information and Systems (2018) E101D(3) 668-677

DOI: 10.1587/transinf.2017EDP7310


Abstract

With the emergence and rapid growth of social networks, a huge number of short texts have accumulated and need to be processed. Inferring the latent topics of collected short texts is an essential task for understanding their hidden structure and predicting new content. A biterm topic model (BTM) was recently proposed for short texts to overcome the sparseness of document-level word co-occurrences by directly modeling the generation process of word pairs. Stochastic inference algorithms based on collapsed Gibbs sampling (CGS) and collapsed variational inference have been proposed for BTM. However, they either incur high computational cost or rely on very crude estimation that does not preserve sufficient statistics. In this work, we develop a stochastic divergence minimization (SDM) inference algorithm for BTM to achieve better predictive likelihood in a scalable way. Experiments show that SDM-BTM trained on 30% of the data outperforms the best existing algorithm trained on the full data.
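To make the notion of "word pairs" concrete, the following is a minimal sketch of how biterms (unordered pairs of words co-occurring in the same short text) might be extracted and counted over a corpus. It illustrates only the input representation that BTM works with, not the SDM inference algorithm itself. The function names `extract_biterms` and `count_biterms` and the toy corpus are assumptions made for illustration and do not come from the paper.

```python
from collections import Counter
from itertools import combinations


def extract_biterms(tokens):
    """Return every unordered word pair (biterm) in one short text.

    Each pair is sorted so that (a, b) and (b, a) map to the same key.
    """
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


def count_biterms(corpus):
    """Aggregate biterm counts over the whole corpus (hypothetical helper)."""
    counts = Counter()
    for tokens in corpus:
        counts.update(extract_biterms(tokens))
    return counts


# Toy corpus of tokenized short texts (illustrative data only).
corpus = [
    ["apple", "iphone", "launch"],
    ["apple", "pie", "recipe"],
    ["iphone", "launch", "event"],
]

biterm_counts = count_biterms(corpus)
for biterm, n in biterm_counts.most_common(3):
    print(biterm, n)
```

Because the counts are pooled over the corpus, a pair such as ("iphone", "launch") accumulates evidence across texts even though each individual text is too short to provide reliable document-level co-occurrence statistics.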
