The treebanks provided by the Universal Dependencies (UD) initiative are a state-of-the-art resource for cross-lingual and monolingual syntax-based linguistic studies, as well as for multilingual dependency parsing. Creating a UD treebank for a language furthers the UD initiative and provides an important dataset for research and natural language processing in that language. In this paper, we describe how we created a UD treebank for Latvian and how we derived both the basic and the enhanced UD representations from the data in the Latvian Treebank, which is annotated according to a hybrid dependency-constituency grammar model. The hybrid model was inspired by Lucien Tesnière’s dependency grammar theory and its notion of a syntactic nucleus. While the basic UD representation is already a de facto standard in NLP, the enhanced UD representation is just emerging, and the treebank described here is among the first to provide both representations.
CITATION STYLE
Pretkalniņa, L., Rituma, L., & Saulīte, B. (2018). Deriving enhanced universal dependencies from a hybrid dependency-constituency treebank. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11107 LNAI, pp. 95–105). Springer Verlag. https://doi.org/10.1007/978-3-030-00794-2_10