Capturing divergence in dependency trees to improve syntactic projection

Abstract

Obtaining syntactic parses is an important step in many NLP pipelines. However, most of the world’s languages do not have a large amount of syntactically annotated data available for building parsers. Syntactic projection techniques attempt to address this issue by using parallel corpora consisting of resource-poor and resource-rich language pairs, taking advantage of a parser for the resource-rich language and word alignment between the languages to project the parses onto the data for the resource-poor language. These projection methods can suffer, however, when syntactic structures for some sentence pairs in the two languages look quite different. In this paper, we investigate the use of small, parallel, annotated corpora to automatically detect divergent structural patterns between two languages. We then use these detected patterns to improve projection algorithms and dependency parsers, allowing for better performing NLP tools for resource-poor languages, particularly those that may not have large amounts of annotated data necessary for traditional, fully-supervised methods. While this detection process is not exhaustive, we demonstrate that common patterns of divergence can be identified automatically without prior knowledge of a given language pair, and the patterns can be used to improve performance of syntactic projection and parsing.
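The core projection step the abstract describes — parsing the resource-rich side and carrying dependency arcs across word alignments — can be sketched as follows. This is a minimal illustration under simplifying assumptions (1-to-1 alignments, direct projection in the style the paper builds on), not the authors' actual algorithm; the function name and data representation are hypothetical.

```python
# Hypothetical sketch of direct dependency projection across word alignments.
# Assumes 1-to-1 alignments; unaligned target words are left unattached,
# which is exactly the situation where divergence detection helps.

def project_dependencies(src_heads, alignment):
    """src_heads[i] is the index of source word i's head (-1 for root).
    alignment maps source index -> target index (assumed 1-to-1).
    Returns a dict: target index -> projected head index (-1 for root)."""
    tgt_heads = {}
    for s, t in alignment.items():
        h = src_heads[s]
        if h == -1:
            tgt_heads[t] = -1              # source root projects to target root
        elif h in alignment:
            tgt_heads[t] = alignment[h]    # follow the alignment link of the head
        # else: head is unaligned, so this target word stays unattached
    return tgt_heads

# Toy example: source "the dog barks" with a same-order target sentence.
src_heads = [1, 2, -1]        # "the"->"dog", "dog"->"barks", "barks" is root
alignment = {0: 0, 1: 1, 2: 2}
print(project_dependencies(src_heads, alignment))  # {0: 1, 1: 2, 2: -1}
```

When the two trees diverge (e.g. a head-swap or a many-to-one alignment), this naive projection produces wrong or missing arcs, which motivates the paper's automatic detection of divergent patterns.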

Citation
Georgi, R., Xia, F., & Lewis, W. D. (2014). Capturing divergence in dependency trees to improve syntactic projection. Language Resources and Evaluation, 48(4), 709–739. https://doi.org/10.1007/s10579-014-9273-4
