Applying word co-occurrence graph in enhancing LDA model for topic discovering in large-scaled text corpus

Abstract

Topic models such as LDA are useful tools for the statistical analysis of text document collections and other text-based data, and topic modeling has recently become an attractive research field because of its wide range of applications. However, traditional topic models such as LDA still have disadvantages, owing to the shortcomings of the bag-of-words (BOW) model and to low performance when handling large text corpora. In this paper, we therefore present a novel topic modeling approach, called LDA-GOW, which combines a word co-occurrence model, also called the graph-of-words (GOW) model, with the traditional LDA topic discovery model. The LDA-GOW topic model not only extracts more informative topics from text but is also able to facilitate topic discovery in large-scale text corpora. To demonstrate its effectiveness, we compare the proposed model with the traditional LDA topic model on several standardized datasets: WebKB, Reuters-R8, and annotated scientific documents collected from the ACM Digital Library. Across all experiments, the proposed LDA-GOW model achieves approximately 70.86% accuracy.
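
The abstract does not say how the graph-of-words is constructed. As a rough illustration only, and not the authors' implementation, a word co-occurrence graph is commonly built by sliding a fixed-size window over the token sequence and linking words that fall inside the same window. The function name `build_cooccurrence_graph`, the window size of 3, and the use of undirected, count-weighted edges are all assumptions made for this sketch.

```python
from collections import Counter


def build_cooccurrence_graph(tokens, window=3):
    """Return a Counter mapping unordered word pairs to co-occurrence counts.

    Hypothetical helper: the window size and the undirected, count-weighted
    edges are assumptions, not details taken from the paper.
    """
    edges = Counter()
    for start in range(len(tokens)):
        span = tokens[start:start + window]
        head = span[0]
        # Pair the first word of the window with each later word, so every
        # pair of positions closer than `window` is counted exactly once.
        for other in span[1:]:
            if other != head:
                edges[tuple(sorted((head, other)))] += 1
    return edges


tokens = "topic models extract topics from text".split()
graph = build_cooccurrence_graph(tokens, window=3)
for (left, right), weight in sorted(graph.items()):
    print(left, right, weight)
```

Unlike a bag-of-words vector, the resulting edges keep some information about which words appear near each other, which is the kind of signal the abstract says LDA-GOW adds on top of LDA.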

Cite

Pham, P., & Do, P. (2019). Applying word co-occurrence graph in enhancing LDA model for topic discovering in large-scaled text corpus. International Journal of Recent Technology and Engineering, 8(2 Special Issue 8), 1366–1371. https://doi.org/10.35940/ijrte.B1068.0882S819
