Abstract
Statistical machine translation techniques offer great promise for the development of automatic translation systems. However, the realization of this potential requires the availability of significant amounts of parallel bilingual text. This paper reports on an attempt to reduce the amount of text that is required to obtain an acceptable translation system, through the use of active and semi-supervised learning. Systems were built using resources collected from South African government websites, and the results were evaluated using a standard automatic evaluation metric (BLEU). We show that significant improvements in translation quality can be achieved with very limited parallel corpora, and that both active learning and semi-supervised learning are useful in this context.
Cite
Kato, R. S. M., & Barnard, E. (2007). Statistical translation with scarce resources: A South African case study. SAIEE Africa Research Journal, 98(4), 136–140. https://doi.org/10.23919/saiee.2007.9485635