A hybrid sentence splitting method by comma insertion for machine translation with CRF

Abstract

When writing formal articles, many English writers use long sentences with few punctuation marks. Because long sentences are difficult for machine translation systems, many researchers split them at punctuation marks before translation. Sentences with few punctuation marks, however, remain hard to handle. In this paper we use a log-linear model to insert commas at appropriate positions in long sentences, shortening the resulting sub-sentences to benefit machine translation. Experimental results show that our method segments long sentences reasonably and improves the quality of machine translation.

Citation (APA)

Yang, S., Feng, C., & Huang, H. (2015). A hybrid sentence splitting method by comma insertion for machine translation with CRF. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9427, pp. 141–152). Springer Verlag. https://doi.org/10.1007/978-3-319-25816-4_12
