Translational equivalence in Statistical Machine Translation or meaning as co-occurrence

Abstract

In this paper, we describe the current state of the art in Statistical Machine Translation (SMT) and reflect on how SMT handles meaning. Statistical Machine Translation is a corpus-based approach to MT: it derives the knowledge required to generate new translations from corpora. General-purpose SMT systems do not use any formal semantic representation. Instead, they directly extract translationally equivalent words or word sequences – expressions with the same meaning – from bilingual parallel corpora. All statistical translation models are based on the idea of word alignment, i.e., the automatic linking of corresponding words in parallel texts. The first-generation SMT systems were word-based. From a linguistic point of view, the major problem with word-based systems is that the meaning of a word is often ambiguous and is determined by its context. Current state-of-the-art SMT systems try to capture local contextual dependencies by using phrases instead of words as units of translation. To solve more complex ambiguity problems (where a broader text scope or even domain information is needed), a Word Sense Disambiguation (WSD) module is integrated into the Machine Translation environment.
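The word alignment idea described above is commonly estimated with expectation–maximization, as in the classic IBM Model 1. The sketch below is a minimal illustration of that general technique, not the paper's own implementation; the toy German–English corpus and the fixed number of EM iterations are assumptions for demonstration only.

```python
# Toy sketch of word alignment via IBM Model 1 EM training.
# The parallel corpus and iteration count are illustrative assumptions.
from collections import defaultdict

corpus = [
    (["das", "haus"], ["the", "house"]),
    (["das", "buch"], ["the", "book"]),
    (["ein", "buch"], ["a", "book"]),
]

# Uniform initialisation of translation probabilities t(e|f)
f_vocab = {f for fs, _ in corpus for f in fs}
e_vocab = {e for _, es in corpus for e in es}
t = {(e, f): 1.0 / len(e_vocab) for e in e_vocab for f in f_vocab}

for _ in range(10):  # EM iterations
    count = defaultdict(float)
    total = defaultdict(float)
    for fs, es in corpus:
        for e in es:
            z = sum(t[(e, f)] for f in fs)   # normalisation constant
            for f in fs:
                c = t[(e, f)] / z            # expected fractional count
                count[(e, f)] += c
                total[f] += c
    for (e, f) in t:                         # M-step: re-estimate t(e|f)
        t[(e, f)] = count[(e, f)] / total[f]

# After training, "haus" aligns most strongly with "house"
best = max(e_vocab, key=lambda e: t[(e, "haus")])
print(best)
```

Even on this tiny corpus, EM resolves the ambiguity co-occurrence by co-occurrence: because "das" appears with both "haus" and "buch", probability mass shifts so that each source word ends up paired with its consistent target word, which is exactly the "meaning as co-occurrence" intuition in the title.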

Citation (APA)
Macken, L., & Lefever, E. (2008). Translational equivalence in Statistical Machine Translation or meaning as co-occurrence. Linguistica Antverpiensia, New Series – Themes in Translation Studies, 7, 193–208. https://doi.org/10.52034/LANSTTS.V7I.215
