Abstract
Accurate identification of phrasal translation equivalents is critical to both phrase-based and syntax-based machine translation systems. We show that the extraction of many phrasal translation equivalents is made impossible by word alignments done without taking syntactic structures into consideration. To address the problem, we propose a new annotation scheme where word alignment and the alignment of non-terminal nodes (i.e., phrases) are done simultaneously to avoid conflicts between word alignments and syntactic structures. Relying on this new alignment approach, we construct a Hierarchically Aligned Chinese-English Parallel Treebank (HACEPT), and show that all phrasal translation equivalents can be automatically extracted based on the phrase alignments in HACEPT.
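To make the claim about blocked extraction concrete, here is a minimal sketch of the standard phrase-pair consistency check used in phrase-based extraction. It is not the annotation scheme or extraction procedure from this paper. The function name `is_consistent`, the index-pair representation of alignment links, and the toy alignments are all assumptions chosen for illustration. A phrase pair is treated as extractable only if no word inside it is aligned to a word outside it, so a single link that ignores constituent boundaries can rule out an otherwise valid phrase pair.

```python
from typing import Set, Tuple

Link = Tuple[int, int]   # (source word index, target word index), hypothetical representation
Span = Tuple[int, int]   # inclusive (start, end) word indices


def is_consistent(links: Set[Link], src_span: Span, tgt_span: Span) -> bool:
    """Return True if the phrase pair contains at least one link and
    no link connects a word inside the pair to a word outside it."""
    s_lo, s_hi = src_span
    t_lo, t_hi = tgt_span
    has_inside_link = False
    for s, t in links:
        s_in = s_lo <= s <= s_hi
        t_in = t_lo <= t <= t_hi
        if s_in != t_in:
            # The link crosses the phrase boundary, so the pair cannot be extracted.
            return False
        if s_in:
            has_inside_link = True
    return has_inside_link


# Toy three-word sentence pair with a monotone one-to-one alignment.
clean_links = {(0, 0), (1, 1), (2, 2)}
# Same alignment plus one extra link from source word 1 to target word 2.
conflicting_links = clean_links | {(1, 2)}

print(is_consistent(clean_links, (0, 1), (0, 1)))        # True
print(is_consistent(conflicting_links, (0, 1), (0, 1)))  # False
```

In the second call, the extra link leaves the candidate phrase pair on the target side only, which is the kind of conflict between word alignment and phrase boundaries that the abstract describes.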
Cite
Deng, D., Xue, N., & Guo, S. (2015). Harmonizing word alignments and syntactic structures for extracting phrasal translation equivalents. In Proceedings of SSST 2015: 9th Workshop on Syntax, Semantics and Structure in Statistical Translation - NAACL HLT 2015 / SIGMT / SIGLEX Workshop (pp. 1–9). Association for Computational Linguistics (ACL). https://doi.org/10.3115/v1/w15-1001