Towards the improvement of statistical translation models using linguistic features

Alicia Pérez; Inés Torres; Francisco Casacuberta

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2006) 4139 LNAI 716-725

DOI: 10.1007/11816508_71

Abstract

Statistical translation models can be inferred from bilingual samples whenever enough training data are available. However, bilingual corpora are usually too scarce to yield reliable statistical models, particularly for highly inflected or agglutinative languages, where many words appear just once. Such events often distort the statistics. To cope with this problem, we turn to morphological knowledge. Rather than dealing only with running words, we also take advantage of lemmas, producing the translation in two stages. In the first stage we transform the source sentence into a lemmatized target sentence, and in the second stage we convert the lemmatized target sentence into the target full forms. © Springer-Verlag Berlin Heidelberg 2006.
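
The two-stage idea in the abstract can be illustrated with a toy sketch. This is not the authors' system: plain dictionary lookups stand in for the statistical models, the Spanish-to-English word tables are invented for the example, and conditioning the second stage on the source word is an assumption made only to keep the sketch small. All function and table names are hypothetical.

```python
# Toy illustration only: dictionary lookups stand in for statistical models.
# Every table and name here is hypothetical, not taken from the paper.

# Stage 1 table: source word -> target lemma.
SOURCE_TO_LEMMA = {
    "los": "the",
    "gatos": "cat",
    "duermen": "sleep",
}

# Stage 2 table: (target lemma, source word) -> target full form.
# Conditioning on the source word is an assumption of this sketch.
LEMMA_TO_FULL_FORM = {
    ("the", "los"): "the",
    ("cat", "gatos"): "cats",
    ("sleep", "duermen"): "sleep",
}


def to_lemmas(source_words):
    """Stage 1: map each source word to a target lemma.

    Unknown source words are passed through unchanged.
    """
    return [SOURCE_TO_LEMMA.get(word, word) for word in source_words]


def to_full_forms(lemmas, source_words):
    """Stage 2: turn each target lemma into a full form.

    Falls back to the bare lemma when no entry is found.
    """
    return [
        LEMMA_TO_FULL_FORM.get((lemma, word), lemma)
        for lemma, word in zip(lemmas, source_words)
    ]


def translate(sentence):
    """Run both stages on a whitespace-tokenized, lowercased sentence."""
    source_words = sentence.lower().split()
    lemmas = to_lemmas(source_words)
    return " ".join(to_full_forms(lemmas, source_words))


print(translate("Los gatos duermen"))
```

Splitting the task this way means the first table only needs one entry per lemma rather than one per inflected form, which is the data-sparsity argument the abstract makes; a real system would replace both lookups with models estimated from a bilingual corpus.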

Cite

Pérez, A., Torres, I., & Casacuberta, F. (2006). Towards the improvement of statistical translation models using linguistic features. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4139 LNAI, pp. 716–725). Springer Verlag. https://doi.org/10.1007/11816508_71
