Forest-based translation rule extraction

81Citations
Citations of this article
119Readers
Mendeley users who have this article in their library.

Abstract

Translation rule extraction is a fundamental problem in machine translation, especially for linguistically syntax-based systems that need parse trees from either or both sides of the bitext. The current dominant practice only uses 1-best trees, which adversely affects the rule set quality due to parsing errors. So we propose a novel approach which extracts rules from a packed forest that compactly encodes exponentially many parses. Experiments show that this method improves translation quality by over 1 BLEU point on a state-of-the-art tree-to-string system, and is 0.5 points better than (and twice as fast as) extracting on 30- best parses. When combined with our previous work on forest-based decoding, it achieves a 2.5 BLEU points improvement over the baseline, and even outperforms the hierarchical system of Hiero by 0.7 points. © 2008 Association for Computational Linguistics.

Cite

CITATION STYLE

APA

Mi, H., & Huang, L. (2008). Forest-based translation rule extraction. In EMNLP 2008 - 2008 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference: A Meeting of SIGDAT, a Special Interest Group of the ACL (pp. 206–214). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1613715.1613745

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free