Abstract
The treebanks provided by the Universal Dependencies (UD) initiative are a state-of-the-art resource for cross-lingual and monolingual syntax-based linguistic studies, as well as for multilingual dependency parsing. Creating a UD treebank for a language helps further the UD initiative by providing an important dataset for research and natural language processing in that language. In this paper, we describe how we created a UD treebank for Latvian and how we obtained both the basic and enhanced UD representations from the data in the Latvian Treebank, which is annotated according to a hybrid dependency-constituency grammar model. The hybrid model was inspired by Lucien Tesnière’s dependency grammar theory and its notion of a syntactic nucleus. While the basic UD representation is already a de facto standard in NLP, the enhanced UD representation is just emerging, and the treebank described here is among the first to provide both representations.
Cite
Pretkalniņa, L., Rituma, L., & Saulīte, B. (2018). Deriving enhanced universal dependencies from a hybrid dependency-constituency treebank. In Lecture Notes in Computer Science (Vol. 11107 LNAI, pp. 95–105). Springer Verlag. https://doi.org/10.1007/978-3-030-00794-2_10