Abstract
In this paper, we present a unigram segmentation model for statistical machine translation where the segmentation units are blocks: pairs of phrases without internal structure. The segmentation model uses a novel orientation component to handle swapping of neighboring blocks. During training, we collect block unigram counts with orientation: we count how often a block occurs to the left or to the right of some predecessor block. The orientation model is shown to improve translation performance over two models: 1) one in which no block re-ordering is used, and 2) one in which block swapping is controlled only by a language model. We show experimental results on a standard Arabic-English translation task.
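The counting step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. The data layout is assumed: each segmentation is a list of (source_span, block) items in target order, and source spans are half-open (start, end) intervals. The adjacency test, the names count_orientations and orientation_prob, and the 0.5 fallback for unseen blocks are all hypothetical choices made for illustration.

```python
from collections import Counter

# Assumed layout (not from the paper): a segmentation is a list of
# (source_span, block) items in target order, where source_span is a
# half-open (start, end) interval and block is a (source, target) phrase pair.

def count_orientations(segmentations):
    """Count how often each block lies directly left or right of its predecessor."""
    counts = Counter()
    for seg in segmentations:
        for (prev_span, _prev_block), (cur_span, cur_block) in zip(seg, seg[1:]):
            if cur_span[0] == prev_span[1]:
                # Current block starts where the predecessor ends: right of it.
                counts[(cur_block, "R")] += 1
            elif cur_span[1] == prev_span[0]:
                # Current block ends where the predecessor starts: left of it.
                counts[(cur_block, "L")] += 1
            # Non-adjacent blocks are not counted in this sketch.
    return counts

def orientation_prob(counts, block, orientation):
    """Relative frequency of an orientation for a block (0.5 if unseen; an assumption)."""
    total = counts[(block, "L")] + counts[(block, "R")]
    return counts[(block, orientation)] / total if total else 0.5

# Toy data: the block ("c d", "y") occurs once to the right of its
# predecessor and once to the left of it.
monotone = [((0, 2), ("a b", "x")), ((2, 4), ("c d", "y"))]
swapped = [((0, 2), ("a b", "x")), ((4, 6), ("e f", "z")), ((2, 4), ("c d", "y"))]
counts = count_orientations([monotone, swapped])
p_left = orientation_prob(counts, ("c d", "y"), "L")
```

A relative-frequency estimate is used here only because it is the simplest way to turn the two counts into a score; the paper may define its orientation model differently.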
Cite
Tillmann, C. (2004). A unigram orientation model for statistical machine translation. In HLT-NAACL 2004 - Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics, Short Papers (pp. 101–104). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1613984.1614010