Improving domain-specific word alignment with a general bilingual corpus

Hua Wu; Haifeng Wang

Journal Article


Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2004) 3265 262-271

DOI: 10.1007/978-3-540-30194-3_29

9 Citations

43 Readers


Abstract

Conventional word alignment methods either employ statistical models or statistical measures, which need large-scale bilingual sentence-aligned training corpora, or employ dictionaries to guide alignment selection. However, these methods achieve unsatisfactory results when aligning a small-scale domain-specific bilingual corpus without terminological lexicons. This paper proposes an approach to improve word alignment in a specific domain for which only a small-scale domain-specific corpus is available, by adapting word alignment information from the general domain to the specific domain. The approach first trains two statistical word alignment models, one on the large-scale general-domain corpus and one on the small-scale domain-specific corpus, and then uses both models to improve the domain-specific word alignment. Experimental results show a significant improvement in both alignment precision and recall, with a relative error rate reduction of 21.96% compared with state-of-the-art technologies. © Springer-Verlag.
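To make the reported metrics concrete, the following is a minimal Python sketch of how alignment precision, recall, and a relative error rate reduction can be computed. It is not taken from the paper: the function names, the toy alignment links, and the choice of error rate (one minus the Dice coefficient between predicted and reference links) are assumptions made for illustration only, and the paper's exact definition may differ.

```python
def alignment_scores(predicted, reference):
    """Return (precision, recall, error_rate) for two sets of alignment links.

    Each link is a (source_index, target_index) pair.
    Assumption: error rate is defined here as 1 - Dice(predicted, reference);
    the abstract does not state which definition the paper uses.
    """
    predicted, reference = set(predicted), set(reference)
    correct = len(predicted & reference)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(reference) if reference else 0.0
    total = len(predicted) + len(reference)
    error_rate = 1.0 - (2.0 * correct / total) if total else 0.0
    return precision, recall, error_rate


def relative_error_reduction(baseline_error, improved_error):
    """Fraction of the baseline error removed by the improved system."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - improved_error) / baseline_error


# Hypothetical toy data, not results from the paper.
reference = {(0, 0), (1, 1), (2, 2), (3, 3)}
baseline = {(0, 0), (1, 2), (2, 2)}   # one wrong link
adapted = {(0, 0), (1, 1), (2, 2)}    # all links correct, one missing

_, _, baseline_error = alignment_scores(baseline, reference)
_, _, adapted_error = alignment_scores(adapted, reference)
reduction = relative_error_reduction(baseline_error, adapted_error)
print(f"relative error rate reduction: {reduction:.2%}")
```

Under this reading, a reduction of 21.96% means the adapted system removes roughly one fifth of the baseline's alignment error, whatever the absolute error rates are.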

Citation (APA)

Wu, H., & Wang, H. (2004). Improving domain-specific word alignment with a general bilingual corpus. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 3265, 262–271. https://doi.org/10.1007/978-3-540-30194-3_29
