Split or merge: Which is better for unsupervised RST parsing?

Abstract

Rhetorical Structure Theory (RST) parsing is crucial for many downstream NLP tasks that require the discourse structure of a text. Most previous RST parsers are based on supervised learning: they require an annotated corpus of sufficient size and quality, and they depend heavily on the language and domain of that corpus. In this paper, we present two language-independent unsupervised RST parsing methods based on dynamic programming. The first builds the optimal tree with respect to a dissimilarity score function defined for splitting a text span into smaller ones. The second builds the optimal tree with respect to a similarity score function defined for merging two adjacent spans into a larger one. Experimental results on English and German RST treebanks show that the parser based on span merging achieved the best score, an F1 score of around 0.8, which is close to the scores of previous supervised parsers.
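The merge-based idea in the abstract can be illustrated with a small span-based dynamic program. The sketch below is not the paper's implementation: the function name `best_merge_tree`, the `sim(i, k, j)` signature, and the toy similarity (negative distance between the mean values of two spans) are all assumptions made for illustration, and the abstract does not specify the actual score function. It only shows how an optimal binary tree over adjacent spans can be recovered once some merge score is given.

```python
from functools import lru_cache
from typing import Callable, Sequence, Tuple, Union

# A tree is either a leaf index or a pair of subtrees.
Tree = Union[int, Tuple["Tree", "Tree"]]


def best_merge_tree(n: int, sim: Callable[[int, int, int], float]) -> Tuple[float, Tree]:
    """Return the binary tree over units 0..n-1 that maximizes the summed merge score.

    sim(i, k, j) is a hypothetical score for merging span [i, k) with span [k, j).
    """

    @lru_cache(maxsize=None)
    def solve(i: int, j: int) -> Tuple[float, Tree]:
        if j - i == 1:
            return 0.0, i  # a single unit needs no merge
        best = None
        for k in range(i + 1, j):  # try every boundary between the two children
            left_score, left_tree = solve(i, k)
            right_score, right_tree = solve(k, j)
            total = left_score + right_score + sim(i, k, j)
            if best is None or total > best[0]:
                best = (total, (left_tree, right_tree))
        return best

    return solve(0, n)


def mean_distance_sim(values: Sequence[float]) -> Callable[[int, int, int], float]:
    """Toy similarity (an assumption): spans with closer mean values score higher."""

    def sim(i: int, k: int, j: int) -> float:
        left = sum(values[i:k]) / (k - i)
        right = sum(values[k:j]) / (j - k)
        return -abs(left - right)

    return sim


# Four units whose toy "topic values" form two obvious groups.
score, tree = best_merge_tree(4, mean_distance_sim([0, 1, 10, 11]))
print(tree)
```

A split-based variant would have the same recursive shape, with a dissimilarity score rewarded at each split point instead of a similarity score at each merge.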

Cite

Kobayashi, N., Hirao, T., Nakamura, K., Kamigaito, H., Okumura, M., & Nagata, M. (2019). Split or merge: Which is better for unsupervised RST parsing? In EMNLP-IJCNLP 2019 - 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing, Proceedings of the Conference (pp. 5797–5802). Association for Computational Linguistics. https://doi.org/10.18653/v1/d19-1587
