A semantic cover approach for topic modeling

5Citations
Citations of this article
72Readers
Mendeley users who have this article in their library.

Abstract

We introduce a novel topic modeling approach based on constructing a semantic set cover for clusters of similar documents. Specifically, our approach first clusters documents using their Tf-Idf representation, and then covers each cluster with a set of topic words based on semantic similarity, defined in terms of a word embedding. Computing a topic cover amounts to solving a minimum set cover problem. Our evaluation compares our topic modeling approach to Latent Dirichlet Allocation (LDA) on three metrics: 1) qualitative topic match, measured using evaluations by Amazon Mechanical Turk (MTurk) workers, 2) performance on classification tasks using each topic model as a sparse feature representation, and 3) topic coherence. We find that qualitative judgments significantly favor our approach, the method outperforms LDA on topic coherence, and is comparable to LDA on document classification tasks.

Cite

CITATION STYLE

APA

Venkatesaramani, R., Downey, D., Malin, B., & Vorobeychik, Y. (2019). A semantic cover approach for topic modeling. In *SEM@NAACL-HLT 2019 - 8th Joint Conference on Lexical and Computational Semantics (pp. 92–102). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/s19-1011

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free