Log-linear models for Uyghur segmentation in spoken language translation

Abstract

To alleviate data sparsity in spoken Uyghur machine translation, we propose a log-linear approach to morphological segmentation. Instead of learning a model only from a monolingual annotated corpus, this approach optimizes Uyghur segmentation for spoken translation using both bilingual and monolingual corpora. It combines several features: a traditional conditional random field (CRF) feature, a bilingual word alignment feature, and a monolingual suffix-word co-occurrence feature. Experimental results show that the proposed segmentation model for Uyghur spoken translation improves BLEU by 1.6 points over the state-of-the-art baseline.
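The abstract does not give the model's exact form, so the following is only a minimal sketch of how a log-linear model could combine the three feature types to rank candidate segmentations. The feature names, weights, feature values, and placeholder segmentation strings are all hypothetical and are not taken from the paper; in practice the weights would be tuned on held-out data.

```python
import math


def log_linear_score(features, weights):
    """Unnormalized log-linear score: the weighted sum of feature values."""
    return sum(weights[name] * value for name, value in features.items())


def best_segmentation(candidates, weights):
    """Return the highest-scoring candidate and the normalized probabilities."""
    scores = {seg: log_linear_score(f, weights) for seg, f in candidates.items()}
    # Softmax over candidates; subtracting the max keeps exp() numerically stable.
    max_score = max(scores.values())
    exp_scores = {seg: math.exp(s - max_score) for seg, s in scores.items()}
    z = sum(exp_scores.values())
    probs = {seg: e / z for seg, e in exp_scores.items()}
    return max(probs, key=probs.get), probs


# Hypothetical log-domain feature values for two candidate segmentations
# of one word (placeholder strings, not real Uyghur morphemes).
candidates = {
    "stem+suffix1+suffix2": {"crf": -1.2, "alignment": -0.5, "cooccurrence": -0.8},
    "stem+suffix12": {"crf": -0.9, "alignment": -1.6, "cooccurrence": -1.1},
}

# Hypothetical feature weights.
weights = {"crf": 1.0, "alignment": 0.8, "cooccurrence": 0.5}

best, probs = best_segmentation(candidates, weights)
print(best, probs)
```

In this toy setup the second candidate has the better CRF value, but the alignment and co-occurrence features outweigh it, which illustrates how bilingual evidence could change a segmentation decision made from monolingual annotation alone.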

Cite

Mi, C., Yang, Y., Dong, R., Zhou, X., Wang, L., Li, X., & Jiang, T. (2017). Log-linear models for Uyghur segmentation in spoken language translation. In International Conference Recent Advances in Natural Language Processing, RANLP (Vol. 2017-September, pp. 492–500). Incoma Ltd. https://doi.org/10.26615/978-954-452-049-6_065
