Latent Dirichlet allocation (LDA) is a popular probabilistic topic modeling paradigm. In practice, LDA users usually face two problems. First, common and stop words tend to occupy all topics, which hurts topic interpretability. Second, there is little guidance on how to improve the low-dimensional topic features for better clustering or classification performance. To find better topics, we re-examine LDA from three perspectives: continuous features, asymmetric Dirichlet priors, and sparseness constraints, using variants of belief propagation (BP) inference algorithms. We show that continuous features effectively remove common and stop words from topics, and that asymmetric Dirichlet priors have substantial advantages over symmetric priors. Sparseness constraints do not improve overall performance much. © 2014 Springer International Publishing.
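To make the symmetric-vs-asymmetric prior comparison concrete, here is a minimal sketch using gensim's LdaModel, which exposes both prior choices through its alpha parameter. This is an illustration only, not the paper's method: the paper uses BP inference, whereas gensim uses variational Bayes, and the toy corpus below is invented.

    # Sketch: symmetric vs. asymmetric Dirichlet priors over document-topic
    # distributions in LDA (gensim's variational inference, not the paper's BP).
    from gensim.corpora import Dictionary
    from gensim.models import LdaModel

    # Toy corpus; real experiments would use a large document collection.
    docs = [
        ["topic", "model", "inference", "prior"],
        ["dirichlet", "prior", "asymmetric", "topic"],
        ["belief", "propagation", "inference", "model"],
    ]
    dictionary = Dictionary(docs)
    corpus = [dictionary.doc2bow(doc) for doc in docs]

    # Symmetric prior: every topic gets the same Dirichlet concentration.
    lda_sym = LdaModel(corpus, id2word=dictionary, num_topics=2,
                       alpha="symmetric", random_state=0)

    # Asymmetric prior: per-topic concentrations, so a few "background"
    # topics can absorb high-frequency (stop) words.
    lda_asym = LdaModel(corpus, id2word=dictionary, num_topics=2,
                        alpha="asymmetric", random_state=0)

    for name, model in [("symmetric", lda_sym), ("asymmetric", lda_asym)]:
        print(name, model.show_topics(num_words=4))

One intuition, consistent with prior work on asymmetric priors (e.g., Wallach et al., "Rethinking LDA: Why Priors Matter"), is that asymmetric concentrations let a few topics appear in most documents and soak up frequent words, leaving the remaining topics cleaner.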
CITATION STYLE
Wu, X., Zeng, J., Yan, J., & Liu, X. (2014). Finding better topics: Features, priors and constraints. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8444 LNAI, pp. 296–310). Springer Verlag. https://doi.org/10.1007/978-3-319-06605-9_25