Inferring paraphrases for a highly inflected language from a monolingual corpus

Kfir Bar; Nachum Dershowitz

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2014) 8404 LNCS(PART 2) 254-270

DOI: 10.1007/978-3-642-54903-8_22

Abstract

We propose a new technique for deriving paraphrases from a monolingual corpus, supported by a relatively small set of comparable documents. Two somewhat similar phrases, each occurring in one of a pair of documents that deal with the same incident, are taken as potential paraphrases; these candidates are then evaluated based on the contexts in which they appear in the larger monolingual corpus. We apply this technique to Arabic, a highly inflected language, to improve an Arabic-to-English statistical translation system. The paraphrases are provided to the translation system as a word lattice, each assigned a score reflecting its level of equivalence. We experiment with several system configurations, with encouraging results: our best system shows an increase of 1.73 (5.49%) in BLEU. © 2014 Springer-Verlag Berlin Heidelberg.

APA

Bar, K., & Dershowitz, N. (2014). Inferring paraphrases for a highly inflected language from a monolingual corpus. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8404 LNCS, pp. 254–270). Springer Verlag. https://doi.org/10.1007/978-3-642-54903-8_22
