Abstract
Word alignment is the task of finding translationally equivalent words between source and target sentences. Previous work has demonstrated that self-training can achieve competitive word alignment results. In this paper, we propose to use word alignments generated by a third-party word aligner to supervise neural word alignment training. Specifically, when fine-tuning a pre-trained cross-lingual language model, the source and target words of each pair aligned by the third-party aligner are trained to be close neighbors in the contextualized embedding space. Experiments on benchmarks covering various language pairs show that our approach can, surprisingly, self-correct the third-party supervision by finding more accurate word alignments and deleting wrong ones, leading to better performance than various third-party word aligners, including the current best one. When we integrate the supervision from all of the third-party aligners, we achieve state-of-the-art word alignment performance, with alignment error rates on average more than two points lower than those of the best third-party aligner. We released our code at https://github.com/sdongchuanqi/Third-Party-Supervised-Aligner.
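The supervision idea described above, pulling third-party-aligned word pairs together in the embedding space, can be illustrated with a minimal sketch. The abstract does not specify the training objective, so the loss below (a softmax cross-entropy over cosine similarities) is an assumption chosen for illustration, not the authors' formulation. The function name `alignment_loss`, the `temperature` parameter, and the toy vectors are all hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    norm = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    return dot(u, v) / norm if norm else 0.0


def alignment_loss(src_vecs, tgt_vecs, pairs, temperature=0.1):
    """Mean cross-entropy over third-party-aligned pairs (illustrative assumption).

    For each aligned pair (i, j), source word i should be more similar to
    target word j than to any other target word in the sentence.
    """
    if not pairs:
        return 0.0
    total = 0.0
    for i, j in pairs:
        logits = [cosine(src_vecs[i], t) / temperature for t in tgt_vecs]
        # Numerically stable log-sum-exp over all target positions.
        m = max(logits)
        log_z = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_z - logits[j]
    return total / len(pairs)


# Toy 2-d stand-ins for contextualized embeddings of a 2-word sentence pair.
src = [[1.0, 0.0], [0.0, 1.0]]
tgt = [[0.9, 0.1], [0.1, 0.9]]

# Supervision that agrees with the embedding geometry gives a low loss;
# supervision that contradicts it gives a high loss.
good = alignment_loss(src, tgt, [(0, 0), (1, 1)])
bad = alignment_loss(src, tgt, [(0, 1), (1, 0)])
```

In an actual fine-tuning setup, the vectors would come from the cross-lingual language model and the loss would be minimized by gradient descent; this sketch only shows how aligned pairs could act as the supervision signal.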
Cite
Zhang, J., Dong, C., Duan, X., Zhang, Y., & Zhang, M. (2022). Third-Party Aligner for Neural Word Alignments. In Findings of the Association for Computational Linguistics: EMNLP 2022 (pp. 3134–3145). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.findings-emnlp.285