Using word formation rules to extend MT lexicons

Claudia Gdaniec; Esmé Manandise

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2002) 2499 64-73

DOI: 10.1007/3-540-45820-4_7

Abstract

In the IBM LMT Machine Translation (MT) system, a built-in strategy provides lexical coverage of a particular subset of words that are not listed in its bilingual lexicons. The recognition and coding of these words and their transfer generation are based on a set of derivational morphological rules. A new utility writes unfound words of this type, in an LMT-compatible format, to an auxiliary bilingual lexical file to be subsequently merged into the core lexicons. What characterizes this approach is the use of morphological, semantic, and syntactic features for both analysis and transfer. The auxiliary lexical file (ALF) has to be revised before a merge into the core lexicons. This utility integrates linguistics-based analysis and transfer rules with a corpus-based method of verifying or falsifying linguistic hypotheses against extensive document translation, which in addition yields statistics on frequencies of occurrence as well as local context.

APA

Gdaniec, C., & Manandise, E. (2002). Using word formation rules to extend MT lexicons. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 2499, pp. 64–73). Springer Verlag. https://doi.org/10.1007/3-540-45820-4_7
