A new automatic multi-document text summarization using topic modeling


Abstract

This paper proposes a novel methodology for generating an extractive text summary from a corpus of documents. Unlike most existing methods, our approach is designed so that the final summary covers all the important topics in the corpus. We propose a heuristic method that uses Latent Dirichlet Allocation to identify the optimum number of independent topics present in the corpus. Important sentences are then identified within each independent topic using a set of word-level and sentence-level features. To ensure that the final summary is coherent, we suggest a novel technique for reordering the sentences based on sentence similarity. The use of topic modeling ensures that all the important content from the corpus is captured in the extracted summary, which in turn strengthens the summary. Experimental results show that the proposed approach is promising.
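
The abstract does not spell out how sentences are reordered, so the following is only a minimal sketch of one plausible reading of "reordering based on sentence similarity", not the authors' method. It assumes bag-of-words cosine similarity and a greedy chain that always appends the unused sentence most similar to the one just placed. The function names (`bag_of_words`, `cosine`, `reorder_by_similarity`) and the example sentences are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(sentence):
    """Lowercased word counts for one sentence."""
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity of two Counter vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def reorder_by_similarity(sentences):
    """Greedy chain (an assumption, not the paper's algorithm): start from
    the first sentence, then repeatedly append the unused sentence that is
    most similar to the one just placed."""
    if not sentences:
        return []
    vectors = [bag_of_words(s) for s in sentences]
    order = [0]
    remaining = set(range(1, len(sentences)))
    while remaining:
        last = order[-1]
        # Ties are broken by original position so the result is deterministic.
        nxt = max(remaining, key=lambda i: (cosine(vectors[last], vectors[i]), -i))
        order.append(nxt)
        remaining.remove(nxt)
    return [sentences[i] for i in order]


selected = [
    "cats chase mice",
    "stocks fell sharply today",
    "mice fear cats",
    "markets and stocks fell",
]
reordered = reorder_by_similarity(selected)
```

In this toy input the two sentences about cats and mice end up adjacent, followed by the two about stocks. In a pipeline like the one the abstract describes, the input would instead be the sentences selected from each topic, and a different similarity measure could be substituted.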

Citation (APA)

Roul, R. K., Mehrotra, S., Pungaliya, Y., & Sahoo, J. K. (2019). A new automatic multi-document text summarization using topic modeling. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11319 LNCS, pp. 212–221). Springer Verlag. https://doi.org/10.1007/978-3-030-05366-6_17
