Diagnosing and improving topic models by analyzing posterior variability

11 Citations · 31 Mendeley Readers

Abstract

Bayesian inference methods for probabilistic topic models can quantify uncertainty in the parameters, which has primarily been used to increase the robustness of parameter estimates. In this work, we explore other rich information that can be obtained by analyzing the posterior distributions in topic models. Experimenting with latent Dirichlet allocation on two datasets, we propose ideas incorporating information about the posterior distributions at the topic level and at the word level. At the topic level, we propose a metric called topic stability that measures the variability of the topic parameters under the posterior. We show that this metric is correlated with human judgments of topic quality as well as with the consistency of topics appearing across multiple models. At the word level, we experiment with different methods for adjusting individual word probabilities within topics based on their uncertainty. Humans prefer words ranked by our adjusted estimates nearly twice as often when compared to the traditional approach. Finally, we describe how the ideas presented in this work could potentially be applied to other predictive or exploratory models in future work.
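To make the two ideas in the abstract concrete, here is a minimal Python sketch of how one might compute a topic-stability score and uncertainty-adjusted word scores from posterior samples of a topic's word distribution. The specific formulas (mean pairwise cosine similarity for stability; mean-minus-k-standard-deviations for word adjustment) are illustrative assumptions, not necessarily the exact definitions used in the paper.

```python
import numpy as np

def topic_stability(samples):
    """Illustrative stability score for one topic.

    samples: (S, V) array of S posterior samples of the topic's
    word-probability vector over a vocabulary of size V.
    Returns the mean pairwise cosine similarity across samples
    (1.0 means the topic is identical in every sample).
    """
    normed = samples / np.linalg.norm(samples, axis=1, keepdims=True)
    sims = normed @ normed.T          # (S, S) pairwise cosine similarities
    s = samples.shape[0]
    # Average off-diagonal entries (exclude each sample's similarity to itself).
    return (sims.sum() - s) / (s * (s - 1))

def adjusted_word_scores(samples, k=1.0):
    """One plausible uncertainty adjustment for ranking words in a topic:
    penalize words whose posterior probability varies a lot across samples.
    score = posterior mean - k * posterior standard deviation.
    """
    return samples.mean(axis=0) - k * samples.std(axis=0)
```

Words would then be ranked by `adjusted_word_scores` rather than by a single point estimate of the topic-word probabilities, so high-variance words drop in the ranking.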

Citation (APA)

Xing, L., & Paul, M. J. (2018). Diagnosing and improving topic models by analyzing posterior variability. In 32nd AAAI Conference on Artificial Intelligence, AAAI 2018 (pp. 6005–6012). AAAI press. https://doi.org/10.1609/aaai.v32i1.12033
