Discovering Interpretable Topics by Leveraging Common Sense Knowledge

Ismail Harrando; Raphaël Troncy

Conference Proceedings, Open Access

K-CAP 2021 - Proceedings of the 11th Knowledge Capture Conference (2021) 265-268

DOI: 10.1145/3460210.3493586

Abstract

Traditional topic modeling approaches generally rely on document-term co-occurrence statistics to find latent topics in a collection of documents. However, relying only on such statistics can yield incoherent or hard-to-interpret results for end users in the many applications where the interest lies in interpreting the resulting topics (e.g., labeling documents, comparing corpora, or guiding content exploration). In this work, we propose to leverage external common sense knowledge, i.e., information from the real world beyond word co-occurrence, to find topics that are more coherent and more easily interpretable by humans. We introduce the Common Sense Topic Model (CSTM), a novel and efficient approach that augments clustering with knowledge extracted from the ConceptNet knowledge graph. We evaluate this approach on several datasets alongside commonly used models, using both automatic and human evaluation, and show that it has a superior affinity to human judgement. The code for the experiments, as well as the training data and human evaluation, are available at https://github.com/D2KLab/CSTM.

Cite

Harrando, I., & Troncy, R. (2021). Discovering Interpretable Topics by Leveraging Common Sense Knowledge. In K-CAP 2021 - Proceedings of the 11th Knowledge Capture Conference (pp. 265–268). Association for Computing Machinery, Inc. https://doi.org/10.1145/3460210.3493586
