Abstract
We present a first analysis of inter-annotator agreement for the DIT++ tagset of dialogue acts, a comprehensive, layered, multidimensional set of 86 tags. Within a dimension or a layer, subsets of tags are often hierarchically organised. We argue that, especially for such highly structured annotation schemes, the well-known kappa statistic is not an adequate measure of inter-annotator agreement. Instead, we propose a statistic that takes the structural properties of the tagset into account, and we discuss the application of this statistic in an annotation experiment. The experiment shows promising agreement scores for most dimensions in the tagset and provides useful insights into the usability of the annotation scheme, but it also indicates that several additional factors influence annotator agreement. Finally, we suggest that the proposed approach for measuring agreement per dimension can be a good basis for measuring annotator agreement over the dimensions of a multidimensional annotation scheme. © 2006 Association for Computational Linguistics.
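To illustrate why plain kappa can be too strict for a hierarchical tagset, the following minimal sketch compares standard Cohen's kappa with a generic weighted kappa that gives partial credit when one tag is the parent of the other. This is not the statistic proposed in the paper, which the abstract does not specify. The tag names, the `PARENT` hierarchy, the 0.5 partial-credit value, and the function names are all hypothetical and chosen only for illustration.

```python
from collections import Counter

def cohen_kappa(a, b):
    """Standard Cohen's kappa: any two different tags count as full disagreement."""
    n = len(a)
    p_o = sum(x == y for x, y in zip(a, b)) / n          # observed agreement
    ca, cb = Counter(a), Counter(b)
    p_e = sum(ca[t] * cb[t] for t in ca) / (n * n)       # chance agreement
    return (p_o - p_e) / (1 - p_e)

def weighted_kappa(a, b, weight):
    """Weighted kappa where weight(s, t) in [0, 1] is the credit for the pair (s, t)."""
    n = len(a)
    ca, cb = Counter(a), Counter(b)
    p_o = sum(weight(x, y) for x, y in zip(a, b)) / n
    p_e = sum(ca[s] * cb[t] * weight(s, t) for s in ca for t in cb) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical toy hierarchy (child -> parent); NOT the DIT++ tagset.
PARENT = {
    "check-question": "question",
    "wh-question": "question",
    "question": None,
    "inform": None,
}

def hier_weight(s, t):
    """Assumed credit: 1.0 for identical tags, 0.5 for a parent/child pair, else 0.0."""
    if s == t:
        return 1.0
    if PARENT.get(s) == t or PARENT.get(t) == s:
        return 0.5
    return 0.0

# Two hypothetical annotators labelling the same four utterances.
ann_a = ["question", "check-question", "inform", "wh-question"]
ann_b = ["question", "question", "inform", "inform"]

plain = cohen_kappa(ann_a, ann_b)
structured = weighted_kappa(ann_a, ann_b, hier_weight)
print(f"plain kappa: {plain:.3f}, hierarchy-weighted kappa: {structured:.3f}")
```

On this toy data the hierarchy-weighted score is higher than the plain one, because the pair "check-question" versus "question" is treated as a partial match rather than as a complete disagreement.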
Cite
Geertzen, J., & Bunt, H. (2006). Measuring annotator agreement in a complex hierarchical dialogue act annotation scheme. In COLING/ACL 2006 - SIGdial06: 7th SIGdial Workshop on Discourse and Dialogue, Proceedings of the Workshop (pp. 126–133). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1654595.1654619