Semantic Annotation of MASC

  • Baker C
  • Fellbaum C
  • Passonneau R J

Abstract

Word Sense Disambiguation (WSD) continues to present a formidable challenge for Natural Language Processing. To improve automatic WSD, manually annotated corpora are created to serve as training and testing data. When the annotation labels are drawn from an independently created lexical resource, there is the added benefit of checking the resource’s lexical inventory and sense representations against the corpus data; any resulting corrections can in turn benefit future manual and automatic annotation. We report on the annotation of selected word forms of different parts of speech in the MASC corpus with WordNet senses. Analyses of the annotations reveal good annotator agreement for half of the lemmas but low agreement for the other half, with no obvious explanation for the difference. Through crowdsourcing, however, we collected many labels per word instead of a single one, creating a corpus in which a single ground truth label per sentence, together with a confidence, can be inferred from the many labels. Even for words with low agreement, many of the instances have confident labels. In a complementary effort, 100 of the MASC sentences with WordNet-annotated lemmas were fully annotated with FrameNet lexical units and Frame Elements. This allowed the WordNet and FrameNet senses for the chosen lemmas to be compared and aligned. We reflect on the fundamental design differences between these two complementary resources and their respective contributions to WSD. The MASC word sense annotation effort has demonstrated that it is possible to collect reliable manual annotations of moderately polysemous words, but also that we do not yet know what makes this possible for some words and not others. The corpus, therefore, can serve as a valuable resource for investigating this question.
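
The abstract does not say how a single ground truth label and its confidence are inferred from the many crowdsourced labels. As a rough illustration only, the sketch below assumes a plain majority vote and reports the winning label's share of the votes as the confidence. The function name `aggregate_labels` and the sense labels are hypothetical, and the chapter itself may use a different, more sophisticated model.

```python
from collections import Counter


def aggregate_labels(labels):
    """Return (majority_label, confidence) for one annotated instance.

    Confidence is the fraction of annotators who chose the winning label.
    This is an illustrative stand-in, not the method used in the chapter.
    """
    if not labels:
        raise ValueError("need at least one label")
    counts = Counter(labels)
    # most_common(1) yields the label with the highest vote count;
    # ties are broken by first occurrence in the input.
    label, votes = counts.most_common(1)[0]
    return label, votes / len(labels)


# Hypothetical sense labels from five crowd annotators for one instance.
crowd = ["sense_1", "sense_1", "sense_2", "sense_1", "sense_3"]
label, confidence = aggregate_labels(crowd)
print(label, confidence)
```

Under this assumption, an instance of a low-agreement word can still receive a confident label whenever most of its own annotators happen to agree, which is consistent with the observation in the abstract.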

Cite

Baker, C., Fellbaum, C., & Passonneau, R. J. (2017). Semantic Annotation of MASC. In Handbook of Linguistic Annotation (pp. 699–717). Springer Netherlands. https://doi.org/10.1007/978-94-024-0881-2_25
