o-HETM: An online hierarchical entity topic model for news streams

Abstract

Nowadays, with the development of the Internet, the large volume of continuously streaming news has become overwhelming to the public. Constructing a dynamic topic hierarchy that organizes news articles by multi-grain topics lets users find what interests them as quickly as possible. However, this is nontrivial because news data are streaming and time-sensitive. In this paper, to address these challenges, we propose a Hierarchical Entity Topic Model (HETM), which considers the timeliness of news data and the importance of named entities in conveying the who/when/where information in news articles. In addition, we propose online HETM (o-HETM), which adapts HETM to streaming news through a fast online inference algorithm. For better understanding of topics, we extract key sentences for each topic to form a summary. Extensive experimental results demonstrate that HETM significantly improves topic quality and time efficiency compared to the state-of-the-art method HLDA (Hierarchical Latent Dirichlet Allocation). Moreover, o-HETM, with its online inference algorithm, improves time efficiency further and is thus applicable to streaming news.

Hu, L., Li, J., Zhang, J., & Shao, C. (2015). o-HETM: An online hierarchical entity topic model for news streams. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9077, pp. 696–707). Springer Verlag. https://doi.org/10.1007/978-3-319-18038-0_54
