The main goal of this paper was to improve topic modelling algorithms by introducing automatic topic labelling, a procedure that chooses a label for the cluster of words in a topic. Topic modelling is a widely used statistical technique that reveals the internal conceptual organization of text corpora. We chose an unsupervised graph-based method and developed it further for Russian. The proposed algorithm consists of two stages: candidate generation by means of PageRank and morphological filters, and candidate ranking. Our topic labelling experiments on a corpus of encyclopaedic texts on linguistics have shown the advantages of labelled topic models for NLP applications.
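The two-stage idea can be illustrated with a minimal sketch. It is not the authors' implementation. The co-occurrence window, the damping factor, the iteration count, and the `allowed` set (a hypothetical stand-in for a morphological filter, for instance one that keeps only nouns) are all assumptions made for illustration, as are the function names.

```python
from collections import defaultdict


def build_graph(words, window=2):
    """Undirected co-occurrence graph: link words at most `window` positions apart."""
    graph = defaultdict(set)
    for i, w in enumerate(words):
        for j in range(i + 1, min(i + window + 1, len(words))):
            if words[j] != w:
                graph[w].add(words[j])
                graph[words[j]].add(w)
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Plain power-iteration PageRank on an undirected graph without isolated nodes."""
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        new_rank = {}
        for v in nodes:
            # Each neighbour u passes an equal share of its rank to v.
            incoming = sum(rank[u] / len(graph[u]) for u in graph[v])
            new_rank[v] = (1 - damping) / n + damping * incoming
        rank = new_rank
    return rank


def label_candidates(words, allowed, top_k=3):
    """Stage 1: score words with PageRank and filter them.
    Stage 2: rank the surviving candidates by score (ties broken alphabetically)."""
    scores = pagerank(build_graph(words))
    # `allowed` is a hypothetical placeholder for a real morphological filter.
    kept = [w for w in scores if w in allowed]
    return sorted(kept, key=lambda w: (-scores[w], w))[:top_k]


topic_words = ["syntax", "of", "phrase", "syntax", "of", "clause", "syntax"]
nouns = {"syntax", "phrase", "clause"}
print(label_candidates(topic_words, nouns, top_k=1))
```

In this toy input the function word "of" is as central in the graph as "syntax", which is why some filtering step is needed before ranking. A real system would replace the `allowed` set with output from a morphological analyser for Russian.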
CITATION STYLE
Mirzagitova, A., & Mitrofanova, O. (2019). Automatic assignment of labels in Topic Modelling for Russian Corpora. In ExLing 2016: Proceedings of 7th Tutorial and Research Workshop on Experimental Linguistics (Vol. 7, pp. 115–118). ExLing Society. https://doi.org/10.36505/exling-2016/07/0025/000284