A Distributed Inflection Model for Translating into Morphologically Rich Languages

Ke Tran; Arianna Bisazza; Christof Monz

Journal Article

MT-Summit-2015 (2015), 1, 145–159

Abstract

Lexical sparsity is a major challenge for machine translation into morphologically rich languages. We address this problem by modeling sequences of fine-grained morphological tags in a bilingual context. To overcome the issue of ambiguous word analyses, we introduce soft tags, which are under-specified representations retaining all possible morphological attributes of a word. In order to learn distributed representations for the soft tags and their interactions, we adopt a neural network approach. This approach allows for the combination of source and target side information to model a wide range of inflection phenomena. Our re-inflection experiments show a substantial increase in accuracy compared to a model trained on morphologically disambiguated data. Integrated into an SMT decoder and evaluated for English-Italian and English-Russian translation, our model yields improvements of up to 1.0 BLEU over a competitive baseline.
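
As a rough illustration of the soft-tag idea described in the abstract, the sketch below assumes a soft tag is simply the set union of the attributes found across all candidate analyses of a word. The function name `soft_tag`, the attribute labels, and the example analyses are hypothetical and not taken from the paper, whose actual representation and neural model may differ.

```python
from typing import FrozenSet, Iterable


def soft_tag(analyses: Iterable[Iterable[str]]) -> FrozenSet[str]:
    """Merge every candidate analysis of a word into one under-specified tag.

    Assumption: each analysis is a collection of attribute labels, and the
    soft tag keeps all of them instead of picking a single analysis.
    """
    merged = set()
    for attributes in analyses:
        merged.update(attributes)
    return frozenset(merged)


# Made-up analyses for an ambiguous word form (labels are illustrative only).
candidate_analyses = [
    ["adj", "masc", "plur"],
    ["adj", "fem", "sing"],
]

ambiguous_tag = soft_tag(candidate_analyses)
print(sorted(ambiguous_tag))
```

Under this assumption, no disambiguation step is needed before training: the ambiguity is kept inside the tag itself, and a downstream model is left to learn which attribute combinations matter in context.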

Cite

Tran, K., Bisazza, A., & Monz, C. (2015). A Distributed Inflection Model for Translating into Morphologically Rich Languages. MT-Summit-2015, 1, 145–159.
