Automatic labelling of topic models learned from Twitter by summarisation

Amparo Elizabeth Cano Basave; Yulan He; Ruifeng Xu

Conference Proceedings, Open Access

52nd Annual Meeting of the Association for Computational Linguistics, ACL 2014 - Proceedings of the Conference (2014), Vol. 2, pp. 618-624

DOI: 10.3115/v1/p14-2101

42 Citations

167 Readers

Abstract

Latent topics derived by topic models such as Latent Dirichlet Allocation (LDA) are the result of hidden thematic structures which provide further insights into the data. The automatic labelling of such topics derived from social media, however, poses new challenges, since topics may characterise novel events happening in the real world. Existing automatic topic labelling approaches which depend on external knowledge sources become less applicable here, since relevant articles or concepts for the extracted topics may not exist in external sources. In this paper we propose to address the problem of automatic labelling of latent topics learned from Twitter as a summarisation problem. We introduce a framework which applies summarisation algorithms to generate topic labels. These algorithms are independent of external sources and rely only on the identification of dominant terms in documents related to the latent topic. We compare the efficiency of existing state-of-the-art summarisation algorithms. Our results suggest that summarisation algorithms generate better topic labels which capture event-related context compared to the top-n terms returned by LDA. © 2014 Association for Computational Linguistics.
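To make the idea of labelling a topic from the dominant terms of its related documents concrete, the following is a minimal sketch only. It is not the framework or any of the summarisation algorithms evaluated in the paper; it assumes a plain document-frequency ranking as a stand-in. The function names `tokenize` and `label_topic`, the stopword list, and the example tweets are all hypothetical and introduced purely for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"the", "a", "an", "of", "in", "on", "to", "and", "is", "for", "at", "rt"}


def tokenize(text):
    """Lowercase a tweet and return its alphabetic tokens, minus stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def label_topic(docs, n=3):
    """Return the n terms that occur in the most documents of a topic.

    Each term is counted at most once per document, so a single repetitive
    tweet cannot dominate the label. No external knowledge source is used.
    """
    counts = Counter()
    for doc in docs:
        counts.update(set(tokenize(doc)))
    # Sort by descending document frequency, then alphabetically so ties
    # are broken deterministically.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:n]]


# Hypothetical tweets assumed to be associated with one latent topic.
topic_docs = [
    "Earthquake hits the coast, tsunami warning issued",
    "Tsunami warning after earthquake",
    "RT earthquake damage reported on the coast",
]

print(label_topic(topic_docs, n=3))
```

In this sketch the label is drawn from the tweets assigned to the topic rather than from the topic's word distribution, which is the distinction the abstract draws against the top-n terms returned by LDA. A real summarisation algorithm would typically score and select whole phrases or sentences rather than isolated terms.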

Cite

APA

Cano Basave, A. E., He, Y., & Xu, R. (2014). Automatic labelling of topic models learned from Twitter by summarisation. In 52nd Annual Meeting of the Association for Computational Linguistics, ACL 2014 - Proceedings of the Conference (Vol. 2, pp. 618–624). Association for Computational Linguistics (ACL). https://doi.org/10.3115/v1/p14-2101
