Latent Dirichlet allocation (LDA) is a popular probabilistic topic modeling paradigm. In practice, LDA users usually face two problems. First, common and stop words tend to occupy all topics, which hurts topic interpretability. Second, there is little guidance on how to improve the low-dimensional topic features for better clustering or classification performance. To find better topics, we re-examine LDA from three perspectives: continuous features, asymmetric Dirichlet priors, and sparseness constraints, using variants of belief propagation (BP) inference algorithms. We show that continuous features effectively remove common and stop words from topics, and that asymmetric Dirichlet priors have substantial advantages over symmetric priors. Sparseness constraints do not improve overall performance much. © 2014 Springer International Publishing.
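To make the symmetric-vs-asymmetric prior comparison concrete, here is a minimal sketch using gensim's LdaModel, which exposes both prior choices through its alpha parameter. This is an illustration only, not the paper's method: the paper uses BP inference, whereas gensim uses variational Bayes, and the toy corpus below is invented.

    # Sketch: symmetric vs. asymmetric Dirichlet priors over document-topic
    # distributions in LDA (gensim's variational inference, not the paper's BP).
    from gensim.corpora import Dictionary
    from gensim.models import LdaModel

    # Toy corpus; real experiments would use a large document collection.
    docs = [
        ["topic", "model", "inference", "prior"],
        ["dirichlet", "prior", "asymmetric", "topic"],
        ["belief", "propagation", "inference", "model"],
    ]
    dictionary = Dictionary(docs)
    corpus = [dictionary.doc2bow(doc) for doc in docs]

    # Symmetric prior: every topic gets the same Dirichlet concentration.
    lda_sym = LdaModel(corpus, id2word=dictionary, num_topics=2,
                       alpha="symmetric", random_state=0)

    # Asymmetric prior: per-topic concentrations, so a few "background"
    # topics can absorb high-frequency (stop) words.
    lda_asym = LdaModel(corpus, id2word=dictionary, num_topics=2,
                        alpha="asymmetric", random_state=0)

    for name, model in [("symmetric", lda_sym), ("asymmetric", lda_asym)]:
        print(name, model.show_topics(num_words=4))

One intuition, consistent with prior work on asymmetric priors (e.g., Wallach et al., "Rethinking LDA: Why Priors Matter"), is that asymmetric concentrations let a few topics appear in most documents and soak up frequent words, leaving the remaining topics cleaner.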
CITATION STYLE
Wu, X., Zeng, J., Yan, J., & Liu, X. (2014). Finding better topics: Features, priors and constraints. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8444 LNAI, pp. 296–310). Springer Verlag. https://doi.org/10.1007/978-3-319-06605-9_25