Extending the TüBa-D/Z treebank with GermaNet sense annotation

Verena Henrich; Erhard Hinrichs

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2013), Vol. 8105 LNAI, pp. 89–96

DOI: 10.1007/978-3-642-40722-2_9

Abstract

This paper describes the manual construction of a sense-annotated corpus for German, with the goal of providing a gold standard for word sense disambiguation. The underlying textual resource, the TüBa-D/Z treebank, is a German newspaper corpus already enriched with high-quality manual annotations at various levels of grammar. The sense inventory used for tagging word senses is taken from GermaNet [8,9], the German counterpart of the Princeton WordNet for English [6]. With sense annotation for a selected set of 109 words (30 nouns and 79 verbs), which together occur more than 15,500 times in the TüBa-D/Z, the treebank currently represents the largest manually sense-annotated corpus available for GermaNet. © 2013 Springer-Verlag.

Cite

Henrich, V., & Hinrichs, E. (2013). Extending the TüBa-D/Z treebank with GermaNet sense annotation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8105 LNAI, pp. 89–96). https://doi.org/10.1007/978-3-642-40722-2_9
