A joint source-channel model for machine transliteration

Haizhou Li; Min Zhang; Jian Su

Conference Proceedings, Open Access

Proceedings of the Annual Meeting of the Association for Computational Linguistics (2004) 159-166

DOI: 10.3115/1218955.1218976

197 Citations

97 Readers

Abstract

Most foreign names are transliterated into Chinese, Japanese or Korean with approximate phonetic equivalents. The transliteration is usually achieved through intermediate phonemic mapping. This paper presents a new framework that allows direct orthographical mapping (DOM) between two different languages through a joint source-channel model, also called the n-gram transliteration model (TM). With the n-gram TM, we automate the orthographic alignment process to derive the aligned transliteration units from a bilingual dictionary. The n-gram TM under the DOM framework greatly reduces system development effort and provides a quantum leap in transliteration accuracy over other state-of-the-art machine learning algorithms. The modeling framework is validated through several experiments for the English-Chinese language pair.
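
The idea of an n-gram model over aligned transliteration units can be illustrated with a minimal sketch. It assumes the alignment is already given and uses a plain maximum-likelihood bigram with no smoothing. The function names, the boundary markers, and the two-entry toy corpus are hypothetical, chosen only for illustration; they are not the authors' implementation or data.

```python
from collections import Counter
import math

# Hypothetical boundary markers for the start and end of a name.
START = ("<s>", "<s>")
END = ("</s>", "</s>")

def train_bigram_tm(aligned_corpus):
    """Count bigrams of (source unit, target unit) pairs.

    Each corpus entry is a list of already-aligned unit pairs.
    """
    context_counts = Counter()
    bigram_counts = Counter()
    for pairs in aligned_corpus:
        seq = [START] + list(pairs) + [END]
        for prev, cur in zip(seq, seq[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return context_counts, bigram_counts

def joint_log_prob(pairs, context_counts, bigram_counts):
    """Joint log-probability of an aligned pair sequence (MLE, no smoothing)."""
    seq = [START] + list(pairs) + [END]
    logp = 0.0
    for prev, cur in zip(seq, seq[1:]):
        num = bigram_counts[(prev, cur)]
        if num == 0:
            return float("-inf")  # unseen bigram; a real system would smooth
        logp += math.log(num / context_counts[prev])
    return logp

# Made-up toy alignments, for illustration only.
toy_corpus = [
    [("s", "史"), ("mi", "密"), ("th", "斯")],
    [("s", "史"), ("mi", "米"), ("th", "斯")],
]
ctx, big = train_bigram_tm(toy_corpus)
score = joint_log_prob(toy_corpus[0], ctx, big)
unseen = joint_log_prob([("x", "克")], ctx, big)
```

Because each unit pairs a source string with a target string, the model scores both sides jointly rather than conditioning one language on the other. A practical system would also have to learn the alignment itself and smooth the counts, both of which this sketch omits.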

Cite

APA

Li, H., Zhang, M., & Su, J. (2004). A joint source-channel model for machine transliteration. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 159–166). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1218955.1218976
