Using Information about Multi-word Expressions for the Word-Alignment Task

Sriram Venkatapathy; Aravind K. Joshi

Conference Proceedings

COLING ACL 2006 - Multiword Expressions: Identifying and Exploiting Underlying Properties, Proceedings of the Workshop (2006) 20-27

DOI: 10.3115/1613692.1613697

18 Citations

88 Readers

Abstract

Multi-word expressions are well known to be problematic in natural language processing. Previous literature has suggested that information about their degree of compositionality can help in various applications, but this has not been demonstrated empirically. In this paper, we propose a framework in which information about multi-word expressions can be used in the word-alignment task. We show that even simple features such as point-wise mutual information are useful for the word-alignment task on English-Hindi parallel corpora. The alignment error rate we achieve (AER = 0.5040) is significantly better (about a 10% decrease in AER) than the best alignment error rate of the state-of-the-art models (Och and Ney, 2003) (best AER = 0.5518) on the English-Hindi dataset.
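As a rough illustration of the kind of feature the abstract mentions, the sketch below computes point-wise mutual information for a word pair from raw corpus counts, and the relative change between the two reported AER values. The abstract does not give the paper's exact feature definition, so the base-2 formulation, the function names `pmi` and `relative_reduction`, and the toy counts are all assumptions made for illustration only.

```python
import math


def pmi(pair_count, x_count, y_count, total):
    """Point-wise mutual information (base 2) of a word pair.

    Generic textbook definition, log2(P(x, y) / (P(x) * P(y))),
    estimated from raw counts. This is an assumed formulation, not
    necessarily the one used in the paper.
    """
    if min(pair_count, x_count, y_count, total) <= 0:
        raise ValueError("all counts must be positive")
    p_xy = pair_count / total
    p_x = x_count / total
    p_y = y_count / total
    return math.log2(p_xy / (p_x * p_y))


def relative_reduction(baseline, improved):
    """Fractional decrease of `improved` relative to `baseline`."""
    return (baseline - improved) / baseline


# Hypothetical toy counts: the pair co-occurs 20 times in 10,000 pairs.
toy_score = pmi(pair_count=20, x_count=100, y_count=50, total=10000)

# AER values as reported in the abstract.
aer_drop = relative_reduction(baseline=0.5518, improved=0.5040)
```

A high positive PMI indicates that the two words co-occur far more often than chance would predict, which is why it is a common cue for multi-word expressions. With the reported figures, `aer_drop` comes out near 0.087, a relative reduction of just under 9%, which is presumably what the abstract rounds to "about 10%".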

Citation (APA)

Venkatapathy, S., & Joshi, A. K. (2006). Using Information about Multi-word Expressions for the Word-Alignment Task. In COLING ACL 2006 - Multiword Expressions: Identifying and Exploiting Underlying Properties, Proceedings of the Workshop (pp. 20–27). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1613692.1613697
