Automatic labelling of topic models using word vectors and letter trigram vectors

Wanqiu Kou; Fang Li; Timothy Baldwin

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2015) 9460 253-264

DOI: 10.1007/978-3-319-28940-3_20

27 Citations

45 Readers

Abstract

The native representation of LDA-style topics is a multinomial distribution over words, which can be time-consuming to interpret directly. As an alternative representation, automatic labelling has been shown to help readers interpret the topics more efficiently. We propose a novel framework for topic labelling using word vectors and letter trigram vectors. We generate labels automatically and propose automatic and human evaluations of our method. First, we use a chunk parser to generate candidate labels, then map topics and candidate labels to word vectors and letter trigram vectors in order to find which candidate label is most semantically related to that topic. A label can be found by calculating the similarity between a topic and its candidate label vectors. Experiments on three common datasets show that both the labelling method and our approach to automatic evaluation are effective.
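
The ranking step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the vectors are built or combined, so everything below is an assumption made for illustration: words are padded with '#' boundary markers before trigrams are counted, a topic vector is the probability-weighted sum of its top words' trigram counts, and cosine similarity is the similarity measure. The function names (letter_trigrams, topic_vector, rank_labels) and the toy topic are hypothetical, and candidate labels are passed in by hand rather than produced by a chunk parser. Pre-trained word vectors could be scored in the same way by swapping out the vector-building step.

```python
import math
from collections import Counter


def letter_trigrams(text):
    """Count letter trigrams per word, padding each word with '#' (assumed scheme)."""
    counts = Counter()
    for word in text.lower().split():
        padded = f"#{word}#"
        for i in range(len(padded) - 2):
            counts[padded[i:i + 3]] += 1
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dict-like objects."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def topic_vector(top_words):
    """Probability-weighted sum of trigram vectors of a topic's top words (assumption)."""
    vec = Counter()
    for word, prob in top_words:
        for trigram, count in letter_trigrams(word).items():
            vec[trigram] += prob * count
    return vec


def rank_labels(top_words, candidates):
    """Return (similarity, label) pairs sorted from most to least similar."""
    tvec = topic_vector(top_words)
    scored = [(cosine(tvec, letter_trigrams(label)), label) for label in candidates]
    return sorted(scored, reverse=True)


# Toy topic: (word, probability) pairs invented for illustration only.
topic = [("network", 0.05), ("neural", 0.04), ("learning", 0.03)]
candidates = ["neural network", "stock market", "machine learning"]
ranking = rank_labels(topic, candidates)
print(ranking[0][1])
```

In this toy case the top-ranked label is the candidate that shares the most trigrams with the highest-probability topic words. A label with no trigram overlap scores zero, which is one reason a real system would likely combine this signal with word vectors.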

Cite (APA)

Kou, W., Li, F., & Baldwin, T. (2015). Automatic labelling of topic models using word vectors and letter trigram vectors. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9460, pp. 253–264). Springer Verlag. https://doi.org/10.1007/978-3-319-28940-3_20