Abstract
In this paper, a variant of a spectral clustering algorithm is proposed for bilingual word clustering. The proposed algorithm efficiently generates two sets of clusters, one for each language, with high semantic correlation within monolingual clusters and high translation quality across clusters between the two languages. Each cluster-level translation is treated as a bilingual concept that generalizes the words in the paired bilingual clusters. This scheme improves the robustness of statistical machine translation models. Two HMM-based translation models that use these bilingual clusters are tested. Improved perplexity, word alignment accuracy, and translation quality are observed in our experiments.
Cite
Zhao, B., Xing, E. P., & Waibel, A. (2005). Bilingual word spectral clustering for statistical machine translation. In Texts@ACL 2005 - Building and Using Parallel Texts: Data-Driven Machine Translation and Beyond, Proceedings of the Workshop (pp. 25–32). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1654449.1654454