An Expectation Maximization Algorithm for Textual Unit Alignment

ISSN: 0736587X

Abstract

The paper presents an Expectation Maximization (EM) algorithm for automatically generating parallel and quasi-parallel data from comparable corpora of any degree of comparability, from parallel to weakly comparable. Specifically, we address the problem of extracting related textual units (documents, paragraphs, or sentences), relying on the hypothesis that, in a given corpus, certain pairs of translation equivalents are better indicators of a correct textual unit correspondence than others. We evaluate our method on mixed types of bilingual comparable corpora in six language pairs, obtaining state-of-the-art accuracy figures.

Cite

Ion, R., Ceauşu, A., & Irimia, E. (2011). An Expectation Maximization Algorithm for Textual Unit Alignment. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 128–135). Association for Computational Linguistics (ACL).
