We demonstrate that “hallucinating” phrasal translations can significantly improve the quality of machine translation in low-resource conditions. Our hallucinated phrase tables consist of entries composed from multiple unigram translations drawn from the baseline phrase table and from translations that are induced from monolingual corpora. The hallucinated phrase table is very noisy: its translations have low precision but high recall. We counter this noise by introducing 30 new feature functions (including a variety of monolingually estimated features) and by aggressively pruning the phrase table. Our analysis evaluates the intrinsic quality of our hallucinated phrase pairs as well as their impact on end-to-end Spanish-English and Hindi-English MT.
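The composition-and-pruning idea described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a toy unigram translation table with made-up probabilities, keeps the source word order, scores each candidate by the product of its unigram probabilities, and prunes to the top-k candidates. The function name `hallucinate`, the table `toy_table`, and the scoring rule are all hypothetical stand-ins; the paper itself relies on 30 feature functions rather than a single score.

```python
from itertools import product
from math import prod


def hallucinate(source_phrase, unigram_table, top_k=2):
    """Compose candidate phrase translations from per-word translations.

    Assumption: unigram_table maps a source word to {target word: probability}.
    """
    words = source_phrase.split()
    # One candidate list per source word; a word with no entry blocks composition.
    options = [list(unigram_table.get(w, {}).items()) for w in words]
    if any(not opts for opts in options):
        return []

    candidates = []
    for combo in product(*options):
        # Simplification: target words stay in source order (no reordering).
        target = " ".join(t for t, _ in combo)
        # Hypothetical score: product of the unigram probabilities.
        score = prod(p for _, p in combo)
        candidates.append((target, score))

    # Aggressive pruning: keep only the top_k highest-scoring candidates.
    candidates.sort(key=lambda c: (-c[1], c[0]))
    return candidates[:top_k]


# Toy Spanish-English table with invented probabilities.
toy_table = {
    "casa": {"house": 0.7, "home": 0.3},
    "blanca": {"white": 0.9, "blank": 0.1},
}

result = hallucinate("casa blanca", toy_table)
for target, score in result:
    print(target, round(score, 2))
```

Even in this toy case the composed candidates are noisy: the top-ranked output keeps Spanish word order, which is why additional features and pruning are needed to recover precision.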
Irvine, A., & Callison-Burch, C. (2014). Hallucinating phrase translations for low resource MT. In CoNLL 2014 - 18th Conference on Computational Natural Language Learning, Proceedings (pp. 160–170). Association for Computational Linguistics (ACL). https://doi.org/10.3115/v1/w14-1617