Clustering-Based Online News Topic Detection and Tracking through Hierarchical Bayesian Nonparametric Models

Abstract

In this paper, we propose a clustering-based online news topic detection and tracking (TDT) approach based on a hierarchical Bayesian nonparametric framework that allows topics to be shared across different news stories in a corpus. Our approach is formulated using the hierarchical Pitman-Yor process mixture model with the inverted Beta-Liouville (IBL) distribution as its component density, which has been shown to model text data better than the widely used Gaussian distribution. Moreover, we theoretically develop a convergence-guaranteed online learning algorithm, based on variational Bayes, that can effectively learn the proposed TDT model from a stream of news stories. The merits of our TDT approach are illustrated by comparing it with other well-defined clustering-based TDT approaches on different news data sets.
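
The abstract does not spell out the model equations. As a rough illustration of the Pitman-Yor process underlying the mixture model, the following is a minimal sketch of the standard truncated stick-breaking construction for a single (non-hierarchical) Pitman-Yor process. It is not the authors' algorithm: the hierarchy, the IBL component density, and the online variational updates are omitted, and the function name, parameter names, and truncation level are assumptions made for this example.

```python
import random


def pitman_yor_stick_breaking(discount, concentration, num_sticks, seed=0):
    """Draw truncated stick-breaking weights for a Pitman-Yor process.

    Standard construction (names here are illustrative assumptions):
        V_k  ~ Beta(1 - discount, concentration + k * discount),  k = 1..K
        pi_k = V_k * prod_{j < k} (1 - V_j)
    Returns the K mixture weights and the stick mass left after truncation.
    """
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must lie in [0, 1)")
    if concentration <= -discount:
        raise ValueError("concentration must exceed -discount")

    rng = random.Random(seed)
    weights = []
    remaining = 1.0  # length of the stick that has not been broken off yet
    for k in range(1, num_sticks + 1):
        v = rng.betavariate(1.0 - discount, concentration + k * discount)
        weights.append(v * remaining)
        remaining *= 1.0 - v
    return weights, remaining


if __name__ == "__main__":
    weights, leftover = pitman_yor_stick_breaking(0.5, 1.0, num_sticks=20)
    print("largest weights:", sorted(weights, reverse=True)[:5])
    print("mass beyond truncation:", leftover)
```

With a discount of 0 the construction reduces to the Dirichlet process; a larger discount gives heavier-tailed cluster sizes, which is one commonly cited reason for preferring the Pitman-Yor process on text data.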

Cite

Fan, W., Guo, Z., Bouguila, N., & Hou, W. (2021). Clustering-Based Online News Topic Detection and Tracking through Hierarchical Bayesian Nonparametric Models. In SIGIR 2021 - Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 2126–2130). Association for Computing Machinery, Inc. https://doi.org/10.1145/3404835.3462982
