Comparison of latent Dirichlet modeling and factor analysis for topic extraction: A lesson of history

Abstract

Topic modeling is often perceived as a relatively new development in information retrieval sciences, and new methods such as Probabilistic Latent Semantic Analysis and Latent Dirichlet Allocation have generated a lot of research. However, attempts to extract topics from unstructured text using Factor Analysis techniques can be found as early as the 1960s. This paper compares the perceived coherence of topics extracted from three different datasets using Factor Analysis and Latent Dirichlet Allocation. To perform such a comparison, a new extrinsic evaluation method is proposed. Results suggest that Factor Analysis can produce topics perceived by human coders as more coherent than those produced by Latent Dirichlet Allocation, and they warrant revisiting a topic extraction method that was developed more than fifty-five years ago and has since been forgotten.

Cite

Péladeau, N., & Davoodi, E. (2018). Comparison of latent Dirichlet modeling and factor analysis for topic extraction: A lesson of history. In Proceedings of the Annual Hawaii International Conference on System Sciences (Vol. 2018-January, pp. 615–623). IEEE Computer Society. https://doi.org/10.24251/hicss.2018.078
