Non-canonical language is not harder to annotate than canonical language


Abstract

As researchers developing robust NLP for a wide range of text types, we are often confronted with the prejudice that annotation of non-canonical language (whatever that means) is somehow more arbitrary than annotation of canonical language. To investigate this, we present a small annotation study where annotators were asked, with minimal guidelines, to identify main predicates and arguments in sentences across five different domains, ranging from newswire to Twitter. Our study indicates that, at least for this type of annotation, non-canonical language is not harder. However, we also observe that agreements in social media domains correlate less with model confidence, suggesting that annotators may disagree for different reasons when annotating social media data.
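The link between annotator agreement and model confidence described above boils down to a per-sentence correlation. Below is a minimal, hypothetical Python sketch of that kind of computation; the toy labels, the token-level agreement measure, and the confidence scores are illustrative assumptions, not the paper's actual data or metric.

```python
# Hypothetical sketch: per-sentence annotator agreement vs. model confidence.
# Labels, data, and the agreement measure are assumptions for illustration only.
from scipy.stats import spearmanr

def token_agreement(ann_a, ann_b):
    """Fraction of tokens on which two annotators assign the same label."""
    assert len(ann_a) == len(ann_b)
    return sum(a == b for a, b in zip(ann_a, ann_b)) / len(ann_a)

# Toy per-token annotations (PRED = main predicate, ARG = argument, O = other).
sentences = [
    (["O", "PRED", "ARG", "O"], ["O", "PRED", "ARG", "O"]),   # full agreement
    (["PRED", "O", "ARG", "O"], ["O", "PRED", "ARG", "O"]),   # partial agreement
    (["O", "PRED", "O", "ARG"], ["ARG", "O", "PRED", "O"]),   # low agreement
]
# Hypothetical per-sentence model confidence scores (e.g., parser probabilities).
confidence = [0.95, 0.70, 0.40]

agreements = [token_agreement(a, b) for a, b in sentences]
rho, p = spearmanr(agreements, confidence)
print(f"per-sentence agreements: {agreements}")
print(f"Spearman correlation between agreement and confidence: {rho:.2f} (p={p:.2f})")
```

A weaker correlation in social media domains than in newswire, under this kind of analysis, is what would suggest that disagreements there are not simply the "hard" cases the model is also unsure about.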

Citation (APA)

Plank, B., Alonso, H. M., & Søgaard, A. (2015). Non-canonical language is not harder to annotate than canonical language. In LAW 2015 - 9th Linguistic Annotation Workshop, held in conjunction with NAACL 2015 - Proceedings of the Workshop (pp. 148–151). Association for Computational Linguistics (ACL). https://doi.org/10.3115/v1/w15-1617
