Collaboratively Modeling and Embedding of Latent Topics for Short Texts

15 citations · 29 Mendeley readers

This article is free to access.

Abstract

Deriving a good document representation is a critical challenge in many downstream NLP tasks, especially when documents are very short. Short texts are difficult to handle because of their sparsity and noise. Some approaches employ latent topic models, based on global word co-occurrence, and use the topic distribution as the representation; others leverage word embeddings, which capture local conditional dependencies, and map a document to the sum of its word vectors. Unlike existing works that use one to help the other, i.e., topic models for word embeddings or vice versa, we propose CME-DMM, a collaborative modeling and embedding framework for capturing coherent latent topics from short texts. CME-DMM incorporates topic and word embeddings through an attention mechanism and implants them into the latent topic model, which significantly improves the quality of the latent topics. Extensive experiments demonstrate that CME-DMM discovers more coherent topics than other popular methods, resulting in better performance on downstream NLP tasks such as classification. Beyond the interpretable latent topics, the corresponding topic embeddings describe the meanings of the latent topics in the semantic space, and the attention vectors, a by-product of the learning process, identify the keywords in noisy short texts.
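To illustrate the attention idea described in the abstract, the sketch below shows how attention weights over word embeddings, scored against a topic embedding, can both pool a short text into a single document vector and surface its keywords. This is a minimal NumPy illustration under assumed inputs (random toy embeddings; in CME-DMM both word and topic embeddings are learned jointly with the topic model), not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 4

# Hypothetical toy embeddings; CME-DMM learns these jointly with the topic model.
word_emb = {w: rng.normal(size=dim) for w in ["stock", "market", "game", "team"]}
topic_emb = rng.normal(size=dim)  # embedding of one latent topic

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attend(doc_words, topic_vec):
    """Attention over a short text: words aligned with the topic get more weight.

    Returns the attention weights (keyword scores) and the attention-pooled
    document vector, instead of a plain unweighted sum of word vectors.
    """
    E = np.stack([word_emb[w] for w in doc_words])  # (n_words, dim)
    scores = E @ topic_vec                          # relevance of each word to the topic
    alpha = softmax(scores)                         # attention weights, sum to 1
    doc_vec = alpha @ E                             # weighted sum of word embeddings
    return alpha, doc_vec

alpha, doc_vec = attend(["stock", "market", "team"], topic_emb)
# The largest entry of `alpha` marks the word most indicative of the topic,
# which is how attention vectors can identify keywords in noisy short texts.
```

The attention-pooled `doc_vec` replaces the plain summation representation, down-weighting noise words that do not align with the topic embedding.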

Citation (APA)

Liu, Z., Qin, T., Chen, K. J., & Li, Y. (2020). Collaboratively Modeling and Embedding of Latent Topics for Short Texts. IEEE Access, 8, 99141–99153. https://doi.org/10.1109/ACCESS.2020.2997973
