News headline generation based on improved decoder from transformer


This article is free to access.

Abstract

Most news headline generation models based on sequence-to-sequence or recurrent networks have two shortcomings: they cannot be parallelized, and they easily generate repeated words. It is also difficult for them to select the important words in a news article and reproduce those expressions, so the generated headline summarizes the news inaccurately. In this work, we propose the TD-NHG model, which stands for news headline generation based on an improved decoder from the Transformer. TD-NHG uses masked multi-head self-attention to learn feature information from different representation subspaces of news texts, and applies a decoding selection strategy of top-k, top-p, and a punishment mechanism (repetition penalty) in the decoding stage. We conducted comparative experiments on the LCSTS and CSTS datasets; the Rouge-1, Rouge-2, and Rouge-L scores on the LCSTS/CSTS datasets are 31.28/38.73, 12.68/24.97, and 28.31/37.47, respectively. The experimental results demonstrate that the proposed method improves the accuracy and diversity of generated news headlines.
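The decoding selection strategy named in the abstract (top-k and top-p filtering combined with a repetition penalty) can be sketched in a few lines. The function below is an illustrative reconstruction of that general technique, not the paper's actual implementation; the function name, parameter names, and default values are assumptions for the example.

```python
import numpy as np

def sample_next_token(logits, generated_ids, top_k=40, top_p=0.9,
                      repetition_penalty=1.2, rng=None):
    """Pick the next token id using top-k/top-p filtering and a repetition
    penalty. Illustrative sketch only; parameter values are hypothetical,
    not taken from the TD-NHG paper."""
    rng = rng or np.random.default_rng()
    logits = logits.astype(np.float64).copy()

    # Repetition penalty: discourage tokens that have already been generated
    # (positive logits divided, negative logits multiplied by the penalty).
    for tok in set(generated_ids):
        if logits[tok] > 0:
            logits[tok] /= repetition_penalty
        else:
            logits[tok] *= repetition_penalty

    # Top-k: keep only the k highest-scoring tokens.
    if top_k > 0:
        kth = np.sort(logits)[-top_k]
        logits[logits < kth] = -np.inf

    # Softmax over the surviving logits.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()

    # Top-p (nucleus): keep the smallest set of tokens whose cumulative
    # probability reaches p, then renormalize.
    order = np.argsort(probs)[::-1]
    cum = np.cumsum(probs[order])
    cutoff = np.searchsorted(cum, top_p) + 1  # always keep at least one token
    probs[order[cutoff:]] = 0.0
    probs /= probs.sum()

    return int(rng.choice(len(probs), p=probs))
```

With `top_k=1` the call is deterministic (greedy), which makes the effect of the repetition penalty easy to see: a token that was already emitted can lose the argmax position to the runner-up.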

Cite

CITATION STYLE

APA

Li, Z., Wu, J., Miao, J., & Yu, X. (2022). News headline generation based on improved decoder from transformer. Scientific Reports, 12(1). https://doi.org/10.1038/s41598-022-15817-z
