Abstract
Labeling topics learned by topic models is a challenging problem. Previous studies have used words, phrases and images to label topics. In this paper, we propose to use text summaries for topic labeling. Several sentences are extracted from the most related documents to form the summary for each topic. In order to obtain summaries with high relevance, coverage and discrimination for all the topics, we propose an algorithm based on sub-modular optimization. Both automatic and manual analyses have been conducted on two real document collections, and we find that 1) the summaries extracted by our proposed algorithm are superior to the summaries extracted by existing popular summarization methods; 2) the use of summaries as labels has obvious advantages over the use of words and phrases.
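To make the idea of sentence selection by sub-modular optimization concrete, the following is a minimal Python sketch of greedy selection under a sentence budget. It is not the objective or algorithm from the paper: the function `greedy_summary`, the weighted word-coverage score, the overlap penalty standing in for discrimination, and all parameter names are assumptions chosen for illustration only.

```python
def greedy_summary(sentences, topic_weights, other_topic_words=frozenset(),
                   budget=2, penalty=1.0):
    """Greedily pick up to `budget` sentences as a label for one topic.

    Illustrative objective (an assumption, not the paper's):
      coverage       = total weight of topic words covered by the selection
                       (a weighted set-cover term, which is submodular)
      discrimination = -penalty * number of words shared with other topics
                       (a modular term, so the sum stays submodular)
    """
    tokenized = [set(s.lower().split()) for s in sentences]

    def objective(selected):
        covered = set().union(*(tokenized[i] for i in selected))
        coverage = sum(w for word, w in topic_weights.items() if word in covered)
        overlap = sum(len(tokenized[i] & other_topic_words) for i in selected)
        return coverage - penalty * overlap

    selected = []
    while len(selected) < budget:
        base = objective(selected)
        best_gain, best_i = 0.0, None
        for i in range(len(sentences)):
            if i in selected:
                continue
            gain = objective(selected + [i]) - base  # marginal gain of sentence i
            if gain > best_gain:
                best_gain, best_i = gain, i
        if best_i is None:  # no remaining sentence improves the objective
            break
        selected.append(best_i)
    return [sentences[i] for i in selected]


# Hypothetical toy data: top words of one topic and words typical of other topics.
sentences = [
    "stock market prices fell",
    "market prices fell sharply",
    "the team won the football match",
    "investors sold bonds",
]
topic_weights = {"stock": 4.0, "market": 3.0, "investors": 2.0, "bonds": 1.0}
other_topic_words = frozenset({"football", "team"})

print(greedy_summary(sentences, topic_weights, other_topic_words, budget=2))
```

In this toy setup the second sentence adds no new topic words once the first is chosen, so the greedy step prefers a sentence that covers different words, which is the diminishing-returns behaviour that motivates a sub-modular formulation.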
Cite
Wan, X., & Wang, T. (2016). Automatic labeling of topic models using text summaries. In 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Long Papers (Vol. 4, pp. 2297–2305). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/p16-1217