Plink-LDA: Using link as prior information in topic modeling

Huan Xia; Juanzi Li; Jie Tang; Marie Francine Moens

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2012) 7238 LNCS(PART 1) 213-227

DOI: 10.1007/978-3-642-29038-1_17

16 Citations

23 Readers

Abstract

Citations are highly valuable for analyzing documents and have been widely studied in recent years. In existing document models, citations are treated either as attributes of a document, just like its words, or as degrees in the graph-theoretic sense. These methods add citations to the word sampling process to reshape the document representation, but they miss the impact of citations on the generation of content. In this paper, we view citations as prior information that the authors already had. In the generation of a document, its content is split into two parts: the idea of the author and the knowledge from the cited papers. We propose a prior-information-enabled topic model, PLDA. In this model, both the document and its citations play an important role in generating the topic layer. Our experiments on two linked datasets show that our model greatly outperforms basic LDA procedures on a clustering task while also maintaining the dependencies among documents. In addition, we show the feasibility of the model on a citation recommendation task. © 2012 Springer-Verlag.

Citation (APA)

Xia, H., Li, J., Tang, J., & Moens, M. F. (2012). Plink-LDA: Using link as prior information in topic modeling. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7238 LNCS, pp. 213–227). https://doi.org/10.1007/978-3-642-29038-1_17
