Annotating multiparty discourse: Challenges for agreement metrics

Abstract

To computationally model discourse phenomena such as argumentation, we need corpora with reliable annotation of the phenomena under study. Annotating complex discourse phenomena poses two challenges: fuzziness of unit boundaries and the need for multiple annotators. We show that current metrics for inter-annotator agreement (IAA), such as P/R/F1 and Krippendorff's α, provide inconsistent results for the same text. In addition, IAA metrics do not tell us which parts of a text are easier or harder for human judges to annotate, and so do not provide sufficiently specific information for evaluating systems that automatically identify discourse units. We propose a hierarchical clustering approach that aggregates overlapping segments of text identified by multiple annotators; the more annotators who identify a text segment, the easier we assume that segment is to annotate. The clusters make it possible to quantify the extent of agreement judges show about text segments; this information can be used to assess the output of systems that automatically identify discourse units.
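
The abstract does not spell out the clustering procedure, but the core idea — aggregating overlapping annotator spans and using the number of annotators per cluster as a proxy for annotation difficulty — can be illustrated with a minimal sketch. The sketch below uses single-link merging of overlapping character spans as a stand-in for the paper's hierarchical clustering; the annotator IDs and span offsets are hypothetical, chosen only for illustration.

```python
def cluster_spans(annotations):
    """Group overlapping (start, end) character spans from multiple
    annotators into clusters via single-link merging: any two spans
    that overlap (directly or transitively) end up in the same cluster.

    annotations: dict mapping annotator id -> list of (start, end) spans.
    Returns a list of clusters, each recording the merged extent and
    the set of annotators who marked a span within it.
    """
    # Flatten to (start, end, annotator) triples, sorted by start offset.
    spans = sorted(
        (start, end, ann)
        for ann, span_list in annotations.items()
        for start, end in span_list
    )
    clusters = []
    for start, end, ann in spans:
        if clusters and start < clusters[-1]["end"]:
            # Overlaps the current cluster: extend it and record the annotator.
            clusters[-1]["end"] = max(clusters[-1]["end"], end)
            clusters[-1]["annotators"].add(ann)
        else:
            clusters.append({"start": start, "end": end, "annotators": {ann}})
    return clusters

# Hypothetical spans from three annotators over the same text.
annotations = {
    "A1": [(0, 40), (100, 160)],
    "A2": [(5, 38), (210, 250)],
    "A3": [(2, 45), (105, 150), (215, 245)],
}

for c in cluster_spans(annotations):
    support = len(c["annotators"])
    print(f"[{c['start']:>3}, {c['end']:>3}) marked by {support}/3 annotators")
```

On this toy input, the first cluster is marked by all three annotators (an "easy" segment on the abstract's assumption), while the other two are marked by only two of three, giving the kind of per-segment agreement information that a single corpus-level IAA score cannot.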

Citation (APA)
Wacholder, N., Muresan, S., Ghosh, D., & Aakhus, M. (2014). Annotating multiparty discourse: Challenges for agreement metrics. In LAW 2014 - 8th Linguistic Annotation Workshop, in conjunction with COLING 2014 - Proceedings of the Workshop (pp. 120–128). Association for Computational Linguistics (ACL). https://doi.org/10.3115/v1/w14-4918
