Prior work has shown that generalizing the data in an Example Based Machine Translation (EBMT) system reduces the amount of pre-translated text required to achieve a given level of accuracy (Brown, 2000). Several word clustering algorithms have been suggested for performing these generalizations, such as k-means clustering and Group Average Clustering. The hypothesis is that better contextual clustering can lead to better translation accuracy with limited training data. In this paper, we use a form of spectral clustering to cluster words, and show that this yields an improvement of as much as 29.08% over the baseline EBMT system.
Gangadharaiah, R., Brown, R., & Carbonell, J. (2006). Spectral clustering for example based machine translation. In HLT-NAACL 2006 - Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, Short Papers (pp. 41–44). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1614049.1614060