Modelling lexical redundancy for machine translation

Abstract

Certain distinctions made in the lexicon of one language may be redundant when translating into another language. We quantify redundancy among source types by the similarity of their distributions over target types. We propose a language-independent framework for minimising lexical redundancy that can be optimised directly from parallel text. Optimisation of the source lexicon for a given target language is viewed as model selection over a set of cluster-based translation models. Redundant distinctions between types may exhibit monolingual regularities, for example, inflexion patterns. We define a prior over model structure using a Markov random field and learn features over sets of monolingual types that are predictive of bilingual redundancy. The prior makes model selection more robust without the need for language-specific assumptions regarding redundancy. Using these models in a phrase-based SMT system, we show significant improvements in translation quality for certain language pairs. © 2006 Association for Computational Linguistics.
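The abstract does not specify which similarity measure is used to compare distributions over target types, so the following is only a minimal sketch of the general idea: estimate p(target type | source type) from word-aligned parallel text and compare two source types with Jensen–Shannon divergence. The word pairs, function names, and the choice of divergence are illustrative assumptions, not the paper's exact formulation.

```python
from collections import defaultdict
from math import log2

def translation_distributions(aligned_pairs):
    """Estimate p(target type | source type) from word-aligned co-occurrence pairs."""
    counts = defaultdict(lambda: defaultdict(int))
    for src, tgt in aligned_pairs:
        counts[src][tgt] += 1
    dists = {}
    for src, tgt_counts in counts.items():
        total = sum(tgt_counts.values())
        dists[src] = {t: c / total for t, c in tgt_counts.items()}
    return dists

def js_divergence(p, q):
    """Jensen-Shannon divergence (in bits) between two sparse distributions."""
    support = set(p) | set(q)
    m = {t: 0.5 * (p.get(t, 0.0) + q.get(t, 0.0)) for t in support}
    def kl(a, b):
        return sum(pa * log2(pa / b[t]) for t, pa in a.items() if pa > 0)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Hypothetical English-German alignment counts: two source types whose translation
# distributions are nearly identical are redundant for this target language and
# are candidates for merging into a single cluster.
pairs = [("walk", "geht"), ("walks", "geht"), ("walk", "laufen"), ("walks", "laufen"),
         ("bank", "Bank"), ("bank", "Ufer")]
dists = translation_distributions(pairs)
print(js_divergence(dists["walk"], dists["walks"]))  # low divergence: redundant distinction
print(js_divergence(dists["walk"], dists["bank"]))   # high divergence: distinction worth keeping
```

Under this view, model selection amounts to deciding which groups of low-divergence source types to collapse into clusters; the paper's MRF prior over monolingual features (such as shared inflexion patterns) guides that decision rather than relying on language-specific rules.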

Citation (APA)

Talbot, D., & Osborne, M. (2006). Modelling lexical redundancy for machine translation. In COLING/ACL 2006 - 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Vol. 1, pp. 969–976). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1220175.1220297
