Automatic alignment in parallel corpora

Harris Papageorgiou; Lambros Cranias; Stelios Piperidis

Conference Proceedings

Proceedings of the Annual Meeting of the Association for Computational Linguistics (1994) 1994-June 334-336

DOI: 10.3115/981732.981784

Abstract

This paper addresses the alignment issue in the framework of exploitation of large bi-multilingual corpora for translation purposes. A generic alignment scheme is proposed that can meet varying requirements of different applications. Depending on the level at which alignment is sought, appropriate surface linguistic information is invoked coupled with information about possible unit delimiters. Each text unit (sentence, clause or phrase) is represented by the sum of its content tags. The results are then fed into a dynamic programming framework that computes the optimum alignment of units. The proposed scheme has been tested at sentence level on parallel corpora of the CELEX database. The success rate exceeded 99%. The next steps of the work concern the testing of the scheme's efficiency at lower levels endowed with necessary bilingual information about potential delimiters.
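The dynamic programming step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual cost function, so everything specific here is an assumption: each unit is reduced to a single integer (its number of content tags), the cost of pairing units is the absolute difference of those counts, and the allowed alignment patterns (1-1, 1-0, 0-1, 2-1, 1-2) and their penalties are hypothetical values borrowed from common sentence aligners. The names `BEADS`, `PENALTY` and `align` are illustrative only.

```python
from typing import List, Tuple

# Allowed alignment patterns as (source units, target units).
# Assumption: not taken from the paper.
BEADS = [(1, 1), (1, 0), (0, 1), (2, 1), (1, 2)]

# Hypothetical penalties that make 1-1 pairings the default choice.
PENALTY = {(1, 1): 0.0, (1, 0): 3.0, (0, 1): 3.0, (2, 1): 1.0, (1, 2): 1.0}


def align(src: List[int], tgt: List[int]) -> List[Tuple[List[int], List[int]]]:
    """Align two sequences of units given their content-tag counts.

    Returns a list of (source indices, target indices) groups that
    minimises the total cost under the assumptions stated above.
    """
    n, m = len(src), len(tgt)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    # Forward pass: extend every reachable prefix pair by one pattern.
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            for di, dj in BEADS:
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                # Mismatch between the summed tag counts of the two groups.
                diff = abs(sum(src[i:ni]) - sum(tgt[j:nj]))
                c = cost[i][j] + diff + PENALTY[(di, dj)]
                if c < cost[ni][nj]:
                    cost[ni][nj] = c
                    back[ni][nj] = (di, dj)

    # Backward pass: recover the chosen groups from the end to the start.
    pairs = []
    i, j = n, m
    while (i, j) != (0, 0):
        di, dj = back[i][j]
        pairs.append((list(range(i - di, i)), list(range(j - dj, j))))
        i, j = i - di, j - dj
    pairs.reverse()
    return pairs
```

For example, `align([5, 3, 4], [5, 7])` pairs the first source unit with the first target unit and merges the remaining two source units against the second target unit. A real system would replace the single count per unit with whatever tag representation and distance the paper defines.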

Cite

Papageorgiou, H., Cranias, L., & Piperidis, S. (1994). Automatic alignment in parallel corpora. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (Vol. 1994-June, pp. 334–336). Association for Computational Linguistics (ACL). https://doi.org/10.3115/981732.981784
