Robust initialization for learning latent Dirichlet allocation

Abstract

Latent Dirichlet Allocation (LDA) is perhaps the best-known topic model, employed in many different contexts in Computer Science. The wide success of LDA is due to the effectiveness of the model in dealing with large datasets, the competitive performance obtained on several tasks (e.g. classification, clustering), and the interpretability of the solution provided. Learning LDA from training data usually requires iterative optimization techniques such as Expectation-Maximization, for which the choice of a good initialization is of crucial importance to reach an optimal solution. However, even though some clever solutions have been proposed, in practical applications this issue is typically disregarded, and the usual solution is to resort to random initialization. In this paper we address the problem of initializing the LDA model with two novel strategies: the key idea is to perform repeated learning by employing a topic splitting/pruning strategy, such that each learning phase is initialized with an informative starting point derived from the previous phase. The performance of the proposed splitting and pruning strategies has been assessed from a twofold perspective: i) the log-likelihood of the learned model (both on the training set and on a held-out set); ii) the coherence of the learned topics. The evaluation has been carried out on five different datasets, taken from heterogeneous contexts in the literature, showing promising results.
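
The repeated-learning idea can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the pruning criterion or the learning routine, so the code below makes two assumptions. It uses a simple pLSA-style EM as a stand-in for LDA learning, and it prunes the topic with the smallest expected token count. The function names `em_plsa` and `learn_with_pruning` are hypothetical.

```python
import numpy as np

def em_plsa(counts, topic_word, n_iter=50, eps=1e-12):
    """EM for a pLSA-style topic model (a stand-in for LDA learning).

    counts:     (D, V) document-word count matrix
    topic_word: (K, V) initial topic-word distributions, rows sum to 1
    Returns (doc_topic, topic_word, training log-likelihood).
    """
    n_docs = counts.shape[0]
    n_topics = topic_word.shape[0]
    # Document-topic proportions always restart from uniform.
    doc_topic = np.full((n_docs, n_topics), 1.0 / n_topics)
    for _ in range(n_iter):
        # E-step: responsibilities p(topic | doc, word), shape (D, K, V).
        joint = doc_topic[:, :, None] * topic_word[None, :, :]
        resp = joint / (joint.sum(axis=1, keepdims=True) + eps)
        # M-step: re-estimate both distributions from expected counts.
        weighted = counts[:, None, :] * resp
        topic_word = weighted.sum(axis=0) + eps
        topic_word /= topic_word.sum(axis=1, keepdims=True)
        doc_topic = weighted.sum(axis=2) + eps
        doc_topic /= doc_topic.sum(axis=1, keepdims=True)
    mixture = doc_topic @ topic_word
    loglik = float((counts * np.log(mixture + eps)).sum())
    return doc_topic, topic_word, loglik

def learn_with_pruning(counts, k_start, k_target, seed=0):
    """Learn with k_start topics, then prune one topic per phase until
    k_target remain. Only the first phase is randomly initialized."""
    rng = np.random.default_rng(seed)
    topic_word = rng.dirichlet(np.ones(counts.shape[1]), size=k_start)
    while True:
        doc_topic, topic_word, loglik = em_plsa(counts, topic_word)
        if topic_word.shape[0] <= k_target:
            return doc_topic, topic_word, loglik
        # Assumed criterion: drop the topic explaining the fewest tokens.
        doc_lengths = counts.sum(axis=1, keepdims=True)
        usage = (doc_topic * doc_lengths).sum(axis=0)
        topic_word = np.delete(topic_word, np.argmin(usage), axis=0)
```

In this sketch the surviving topic-word distributions are carried over unchanged into the next phase, which is what makes each initialization informative rather than random. A splitting variant would run in the opposite direction, starting with few topics and duplicating and perturbing one topic per phase.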

Cite

Lovato, P., Bicego, M., Murino, V., & Perina, A. (2015). Robust initialization for learning latent Dirichlet allocation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9370, pp. 117–131). Springer Verlag. https://doi.org/10.1007/978-3-319-24261-3_10
