Recent research on shallow discourse parsing has given renewed attention to the role of discourse relation signals, in particular explicit connectives and so-called alternative lexicalizations. We first develop new models for extracting these signals and classifying their senses, covering both explicit connectives and alternative lexicalizations, based on the Penn Discourse Treebank v3 (PDTB) corpus. We then apply these models to various raw corpora and introduce 'discourse sense flows', a new way of modeling the rhetorical style of a document by the linear order of its coherence relations, as captured by the PDTB senses. The corpora span several genres and domains. We undertake comparative analyses of the sense flows and run experiments on automatic genre/domain discrimination using discourse sense flow patterns as features. We find that n-gram patterns are indeed stronger predictors than simple sense (unigram) distributions.
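To make the contrast between unigram distributions and n-gram patterns concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a document has already been reduced to an ordered list of PDTB sense labels (its sense flow). The function names `sense_ngrams` and `relative_frequencies` and the example flow are hypothetical and serve only as illustration.

```python
from collections import Counter
from typing import Dict, Sequence, Tuple


def sense_ngrams(flow: Sequence[str], n: int) -> Counter:
    """Count contiguous n-grams of sense labels in one document's sense flow."""
    # An empty Counter results when the flow is shorter than n.
    return Counter(tuple(flow[i:i + n]) for i in range(len(flow) - n + 1))


def relative_frequencies(counts: Counter) -> Dict[Tuple[str, ...], float]:
    """Normalize counts so that documents of different length are comparable."""
    total = sum(counts.values())
    return {ngram: c / total for ngram, c in counts.items()} if total else {}


# Hypothetical sense flow: the senses of one document in their linear order.
# The labels follow the PDTB naming scheme; the sequence itself is invented.
example_flow = [
    "Contingency.Cause",
    "Expansion.Conjunction",
    "Comparison.Contrast",
    "Expansion.Conjunction",
    "Comparison.Contrast",
]

unigrams = sense_ngrams(example_flow, 1)  # order-free sense distribution
bigrams = sense_ngrams(example_flow, 2)   # captures which sense follows which

print(relative_frequencies(unigrams))
print(relative_frequencies(bigrams))
```

In this sketch the unigram counts only record how often each sense occurs, whereas the bigram counts also record ordering, for example that a conjunction is followed by a contrast twice. Either dictionary could be used as a feature vector for a genre/domain classifier; the choice of classifier is left open here.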
CITATION
Knaebel, R., & Stede, M. (2023). Discourse Sense Flows: Modelling the Rhetorical Style of Documents across Various Domains. In Findings of the Association for Computational Linguistics: EMNLP 2023 (pp. 14462–14482). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.findings-emnlp.964