Language-independent bilingual terminology extraction from a multilingual parallel corpus

Els Lefever; Lieve Macken; Veronique Hoste

Conference Proceedings

EACL 2009 - 12th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings (2009) 496-504

DOI: 10.3115/1609067.1609122

39 Citations

103 Readers

Abstract

We present a language-pair independent terminology extraction module that is based on a sub-sentential alignment system that links linguistically motivated phrases in parallel texts. Statistical filters are applied on the bilingual list of candidate terms that is extracted from the alignment output. We compare the performance of both the alignment and terminology extraction module for three different language pairs (French-English, French-Italian and French-Dutch) and highlight language-pair specific problems (e.g. different compounding strategy in French and Dutch). Comparisons with standard terminology extraction programs show an improvement of up to 20% for bilingual terminology extraction and competitive results (85% to 90% accuracy) for monolingual terminology extraction, and reveal that the linguistically based alignment module is particularly well suited for the extraction of complex multiword terms. © 2009 Association for Computational Linguistics.

Cite

Lefever, E., Macken, L., & Hoste, V. (2009). Language-independent bilingual terminology extraction from a multilingual parallel corpus. In EACL 2009 - 12th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings (pp. 496–504). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1609067.1609122
