News articles convey information through named entities that capture the who, when, and where of an event. Yet extracting the relationships among entities, words, and topics from a large collection of news articles is nontrivial. Topic models such as Latent Dirichlet Allocation (LDA) have been widely applied to mine hidden topics in text and have achieved considerable performance. However, they cannot explicitly show the relationships between words and entities. In this paper, we propose a generative model, the Entity-Centered Topic Model (ECTM), which summarizes the correlations among entities, words, and topics by treating each entity topic as a mixture of word topics. Experiments on real news data sets show that our model achieves lower perplexity and better entity clustering than the state-of-the-art entity topic model, CorrLDA2. We also analyze the results of ECTM and further compare it with CorrLDA2. © Springer-Verlag Berlin Heidelberg 2013.
CITATION STYLE
Hu, L., Li, J., Li, Z., Shao, C., & Li, Z. (2013). Incorporating entities in news topic modeling. In Communications in Computer and Information Science (Vol. 400, pp. 139–150). Springer Verlag. https://doi.org/10.1007/978-3-642-41644-6_14