We present a collection of parallel treebanks that have been automatically aligned on both the terminal and the non-terminal constituent level for use in syntax-based machine translation. We describe how they were constructed and applied to a syntax- and example-based machine translation system called Parse and Corpus-Based Machine Translation (PaCo-MT). For the language pair Dutch to English, we present non-terminal alignment evaluation scores for a variety of tree alignment approaches. Finally, based on the parallel treebanks created by these approaches, we evaluate the MT system itself and compare the scores with those of Moses, a current state-of-the-art statistical MT system, when trained on the same data.
Kotzé, G., Vandeghinste, V., Martens, S., & Tiedemann, J. (2017). Large aligned treebanks for syntax-based machine translation. Language Resources and Evaluation, 51(2), 249–282. https://doi.org/10.1007/s10579-016-9369-0