Discourse segmentation and sentence-level discourse parsing play important roles for various NLP tasks to consider textual coherence. Despite recent achievements in both tasks, there is still room for improvement due to the scarcity of labeled data. To solve the problem, we propose a language model-based generative classifier (LMGC) for using more information from labels by treating the labels as an input while enhancing label representations by embedding descriptions for each label. Moreover, since this enables LMGC to make ready the representations for labels, unseen in the pre-training step, we can effectively use a pretrained language model in LMGC. Experimental results on the RST-DT dataset show that our LMGC achieved the state-of-the-art F1 score of 96.72 in discourse segmentation. It further achieved the state-of-the-art relation F1 scores of 84.69 with gold EDU boundaries and 81.18 with automatically segmented boundaries, respectively, in sentence-level discourse parsing.
CITATION STYLE
Zhang, Y., Kamigaito, H., & Okumura, M. (2021). A Language Model-based Generative Classifier for Sentence-level Discourse Parsing. In EMNLP 2021 - 2021 Conference on Empirical Methods in Natural Language Processing, Proceedings (pp. 2432–2446). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.emnlp-main.188
Mendeley helps you to discover research relevant for your work.