Text summarization, the task of automatically condensing a text into a short summary, is one of the core tasks of natural language processing. Summarization systems for short and long English text, as well as for short Chinese text, have benefited from advances in neural encoder-decoder models thanks to the availability of large datasets. Research on long Chinese text summarization, however, has been limited to datasets of only a few hundred instances. This article explores the long Chinese text summarization task. To begin with, we construct the first large-scale long Chinese text summarization corpus, the Long Chinese Summarization of Police Inquiry Record Text (LCSPIRT). Based on this corpus, we propose a sequence-to-sequence (Seq2Seq) model that incorporates a global encoding process with an attention mechanism. Our model achieves competitive results on the LCSPIRT corpus compared with several benchmark methods.
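To make the architecture concrete, the following is a minimal PyTorch sketch of one common realization of global encoding in a Seq2Seq encoder: a convolutional gated unit that filters the per-token RNN states using sequence-wide context, combined with simple dot-product attention. This is an illustrative sketch under assumed dimensions and layer choices, not the authors' implementation; all names (GlobalEncoder, attention) are hypothetical.

    # Hypothetical sketch of global encoding: a convolution over the whole
    # sequence produces a gate that filters local RNN states. Not the
    # authors' code; dimensions and the single conv layer are assumptions.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class GlobalEncoder(nn.Module):
        def __init__(self, vocab_size: int, hidden: int = 256):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, hidden)
            # Bidirectional GRU yields local, per-token representations.
            self.rnn = nn.GRU(hidden, hidden, batch_first=True,
                              bidirectional=True)
            # 1-D convolution over the sequence captures global context;
            # its sigmoid output gates the RNN states element-wise.
            self.conv = nn.Conv1d(2 * hidden, 2 * hidden,
                                  kernel_size=3, padding=1)

        def forward(self, src: torch.Tensor) -> torch.Tensor:
            h, _ = self.rnn(self.embed(src))                 # (B, T, 2H)
            g = torch.sigmoid(self.conv(h.transpose(1, 2)))  # (B, 2H, T)
            return h * g.transpose(1, 2)                     # gated states

    def attention(dec_state: torch.Tensor,
                  enc_out: torch.Tensor) -> torch.Tensor:
        # Dot-product attention of the decoder state over gated states.
        scores = torch.bmm(enc_out, dec_state.unsqueeze(2)).squeeze(2)
        weights = F.softmax(scores, dim=1)                   # (B, T)
        return torch.bmm(weights.unsqueeze(1), enc_out).squeeze(1)

    # Usage: encode a toy batch and compute one attention context.
    enc = GlobalEncoder(vocab_size=1000)
    src = torch.randint(0, 1000, (2, 15))            # batch 2, length 15
    enc_out = enc(src)                               # (2, 15, 512)
    ctx = attention(torch.randn(2, 512), enc_out)    # (2, 512)

The design intuition is that for long inputs, purely local encoder states accumulate noise and repetition; gating them with a convolution that sees the whole sequence lets the decoder attend to globally salient content.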
Citation:
Xi, X., Pi, Z., & Zhou, G. (2020). Global Encoding for Long Chinese Text Summarization. ACM Transactions on Asian and Low-Resource Language Information Processing, 19(6). https://doi.org/10.1145/3407911