Continuous summarization over microblog threads

N/ACitations
Citations of this article
7Readers
Mendeley users who have this article in their library.
Get full text

Abstract

With the dramatic growth of social media users, microblogs are created and shared at an unprecedented rate. The high velocity and large volumes of short text posts (microblogs) bring redundancies and noise, making it hard for users and analysts to elicit useful information. In this paper, we formalize the problem from a summarization angle-Continuous Summarization over Microblog Threads (CSMT), which considers three facets: information gain of the microblog dialogue, diversity, and temporal information. This summarization problem is different from the classic ones in two aspects: (i) It is considered over a large-scale, dynamic data with high updating frequency; (ii) the context between microblogs are taken into account. We first prove that the CSMT problem is NP-hard. Then we propose a greedy algorithm with (1 − 1/e) performance guarantee. Finally we extend the greedy algorithm on the sliding window to continuously summarize microblogs for threads. Our experimental results on large-scale datasets show that our method is more superior than other two baselines in terms of summary diversity and information gain, with a close time cost to the best performed baseline.

Cite

CITATION STYLE

APA

Song, L., Zhang, P., Bao, Z., & Sellis, T. (2017). Continuous summarization over microblog threads. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10178 LNCS, pp. 511–526). Springer Verlag. https://doi.org/10.1007/978-3-319-55699-4_31

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free