Measuring annotator agreement in a complex hierarchical dialogue act annotation scheme

Abstract

We present a first analysis of inter-annotator agreement for the DIT++ tagset of dialogue acts, a comprehensive, layered, multidimensional set of 86 tags. Within a dimension or a layer, subsets of tags are often hierarchically organised. We argue that, especially for such highly structured annotation schemes, the well-known kappa statistic is not an adequate measure of inter-annotator agreement. Instead, we propose a statistic that takes the structural properties of the tagset into account, and we discuss the application of this statistic in an annotation experiment. The experiment shows promising agreement scores for most dimensions in the tagset and provides useful insights into the usability of the annotation scheme, but it also indicates that several additional factors influence annotator agreement. Finally, we suggest that the proposed approach for measuring agreement per dimension can be a good basis for measuring annotator agreement over the dimensions of a multidimensional annotation scheme. © 2006 Association for Computational Linguistics.
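
The abstract does not give the proposed statistic, so the following is only a rough sketch of the general idea, not the authors' formula. It compares plain Cohen's kappa with a generic distance-weighted kappa on a toy tag hierarchy. The tag names, the `PARENT` table, and the `distance` function are all hypothetical and chosen purely for illustration; they are not taken from DIT++.

```python
from collections import Counter

# Hypothetical toy hierarchy (child -> parent); NOT the real DIT++ tagset.
PARENT = {
    "question": None,
    "yn_question": "question",
    "check": "yn_question",
    "wh_question": "question",
    "inform": None,
}

def ancestors(tag):
    """Return the path from a tag up to its root, the tag itself included."""
    path = []
    while tag is not None:
        path.append(tag)
        tag = PARENT[tag]
    return path

def nominal(a, b):
    """All-or-nothing distance, which yields plain Cohen's kappa."""
    return 0.0 if a == b else 1.0

def distance(a, b):
    """Assumed distance in [0, 1]: 0 for identical tags, a partial penalty
    when one tag is an ancestor of the other, and 1 otherwise."""
    if a == b:
        return 0.0
    path_a, path_b = ancestors(a), ancestors(b)
    if a in path_b or b in path_a:
        steps = abs(len(path_a) - len(path_b))
        return steps / (steps + 1)
    return 1.0

def weighted_kappa(labels1, labels2, dist):
    """Return 1 - (observed disagreement / chance-expected disagreement)."""
    n = len(labels1)
    observed = sum(dist(a, b) for a, b in zip(labels1, labels2)) / n
    c1, c2 = Counter(labels1), Counter(labels2)
    expected = sum(c1[a] * c2[b] * dist(a, b) for a in c1 for b in c2) / (n * n)
    if expected == 0:
        return 1.0  # both annotators used one identical tag throughout
    return 1.0 - observed / expected

# Two annotators; they differ only on the first item (child vs. parent tag).
ann1 = ["check", "inform", "wh_question", "inform"]
ann2 = ["yn_question", "inform", "wh_question", "inform"]

plain = weighted_kappa(ann1, ann2, nominal)
structured = weighted_kappa(ann1, ann2, distance)
print(round(plain, 3), round(structured, 3))
```

In this toy case the single disagreement is between a tag and its direct parent, so the hierarchy-aware score comes out higher than plain kappa, which treats that pair as a full mismatch. That is the kind of distinction the abstract argues an agreement measure for a hierarchical tagset should capture.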

Geertzen, J., & Bunt, H. (2006). Measuring annotator agreement in a complex hierarchical dialogue act annotation scheme. In COLING/ACL 2006 - SIGdial06: 7th SIGdial Workshop on Discourse and Dialogue, Proceedings of the Workshop (pp. 126–133). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1654595.1654619
