Word embedding-based biomedical text summarization

Oussama Rouane; Hacene Belhadef; Mustapha Bouakkaz

Conference Proceedings

Advances in Intelligent Systems and Computing (2020) 1073 288-297

DOI: 10.1007/978-3-030-33582-3_28

4 Citations

14 Readers

Abstract

In this paper, we propose a novel word embedding-based biomedical text summarizer. Biomedical words are represented by dense real-valued vectors. Each sentence is represented by the sum of the vectors of the words it contains. The PageRank algorithm is applied to rank sentences, using cosine similarity as the similarity measure between sentence vectors. The top N ranked sentences are selected to build the summary. For the evaluation, we created a corpus of 200 biomedical papers downloaded from the BioMed Central full-text database. We used a pre-trained Word2vec model whose word vectors were generated from a combination of PubMed, PMC, and a recent English Wikipedia dump. We compared our method with four other summarizers using the ROUGE-1, ROUGE-2, ROUGE-3, and ROUGE-SU4 metrics, evaluating the generated summaries against the abstracts of the papers. Our summarizer achieved improvements of 3.48%, 7.68%, 9.76%, and 3.47%, respectively, over the second-ranked summarizer.
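
The pipeline described in the abstract (sum word vectors per sentence, compare sentences by cosine similarity, rank with PageRank, keep the top N) can be illustrated with a minimal sketch. This is not the authors' implementation. The toy three-dimensional embeddings, the function names, the damping factor of 0.85, the skipping of out-of-vocabulary words, and the clipping of negative similarities to zero are all assumptions made for illustration; the paper uses a pre-trained Word2vec model instead of a hand-written dictionary.

```python
import numpy as np


def sentence_vectors(sentences, embeddings, dim):
    """Represent each tokenized sentence as the sum of its word vectors.

    Assumption: words missing from the embedding table are skipped.
    """
    vecs = np.zeros((len(sentences), dim))
    for i, tokens in enumerate(sentences):
        for tok in tokens:
            if tok in embeddings:
                vecs[i] += embeddings[tok]
    return vecs


def cosine_matrix(vecs):
    """Pairwise cosine similarity between sentence vectors."""
    norms = np.linalg.norm(vecs, axis=1, keepdims=True)
    norms[norms == 0] = 1.0           # avoid division by zero for empty sentences
    unit = vecs / norms
    sim = unit @ unit.T
    np.fill_diagonal(sim, 0.0)        # no self-loops in the sentence graph
    return np.clip(sim, 0.0, None)    # assumption: negative similarities are dropped


def pagerank(weights, damping=0.85, tol=1e-8, max_iter=100):
    """Weighted PageRank by power iteration (damping value is an assumption)."""
    n = weights.shape[0]
    col_sums = weights.sum(axis=0)
    safe_sums = np.where(col_sums > 0, col_sums, 1.0)
    # Column-stochastic transition matrix; isolated nodes jump uniformly.
    transition = np.where(col_sums > 0, weights / safe_sums, 1.0 / n)
    scores = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        new_scores = (1.0 - damping) / n + damping * (transition @ scores)
        if np.abs(new_scores - scores).sum() < tol:
            return new_scores
        scores = new_scores
    return scores


def summarize(sentences, embeddings, dim, top_n):
    """Return indices of the top_n ranked sentences, in document order."""
    scores = pagerank(cosine_matrix(sentence_vectors(sentences, embeddings, dim)))
    top = np.argsort(-scores)[:top_n]
    return sorted(int(i) for i in top)


# Hypothetical toy embeddings standing in for a pre-trained Word2vec model.
toy_embeddings = {
    "gene": np.array([1.0, 0.0, 0.0]),
    "protein": np.array([0.9, 0.1, 0.0]),
    "expression": np.array([0.8, 0.2, 0.0]),
    "weather": np.array([0.0, 0.0, 1.0]),
    "rain": np.array([0.0, 0.1, 0.9]),
}
toy_sentences = [
    ["gene", "expression"],
    ["protein", "expression"],
    ["weather", "rain"],
    ["gene", "protein"],
]
selected = summarize(toy_sentences, toy_embeddings, dim=3, top_n=3)
print(selected)
```

In this toy input, the off-topic third sentence is only weakly connected to the others in the similarity graph, so it receives the lowest score and is left out when three sentences are kept. With real data, the `toy_embeddings` dictionary would be replaced by lookups into the pre-trained model.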

Cite

Rouane, O., Belhadef, H., & Bouakkaz, M. (2020). Word embedding-based biomedical text summarization. In Advances in Intelligent Systems and Computing (Vol. 1073, pp. 288–297). Springer. https://doi.org/10.1007/978-3-030-33582-3_28
