Exploring asymmetric clustering for statistical language modeling

Abstract

The n-gram model is a stochastic model that predicts the next word (the predicted word) given the previous words (the conditional words) in a word sequence. The cluster n-gram model is a variant of the n-gram model in which similar words are classified in the same cluster. It has been demonstrated that using different clusters for predicted and conditional words leads to cluster models that are superior to classical cluster models, which use the same clusters for both words. This is the basis of the asymmetric cluster model (ACM) discussed in our study. In this paper, we first present a formal definition of the ACM. We then describe in detail the methodology of constructing the ACM. The effectiveness of the ACM is evaluated on a realistic application, namely Japanese Kana-Kanji conversion. Experimental results show substantial improvements of the ACM over classical cluster models and word n-gram models at the same model size. Our analysis shows that the high performance of the ACM lies in the asymmetry of the model.
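To make the idea of separate clusterings concrete, here is a minimal Python sketch of a bigram-sized asymmetric cluster model. Everything in it is an assumption for illustration and is not taken from the abstract: the factorization P(w | prev) ≈ P(PW(w) | CW(prev)) × P(w | CW(prev), PW(w)), the hand-written cluster maps `PREDICTED` and `CONDITIONAL`, the toy corpus, and the function names. It uses unsmoothed maximum-likelihood counts and does not reproduce the paper's clustering or construction methodology.

```python
from collections import Counter

def train_acm_bigram(tokens, predicted_cluster, conditional_cluster):
    """Collect counts for a toy bigram asymmetric cluster model (no smoothing)."""
    ctx = Counter()       # count(CW(prev))
    ctx_pw = Counter()    # count(CW(prev), PW(word))
    ctx_pw_w = Counter()  # count(CW(prev), PW(word), word)
    for prev, word in zip(tokens, tokens[1:]):
        cw = conditional_cluster[prev]  # cluster used when the word is context
        pw = predicted_cluster[word]    # cluster used when the word is predicted
        ctx[cw] += 1
        ctx_pw[(cw, pw)] += 1
        ctx_pw_w[(cw, pw, word)] += 1
    return ctx, ctx_pw, ctx_pw_w

def acm_bigram_prob(word, prev, counts, predicted_cluster, conditional_cluster):
    """Assumed factorization: P(PW(word) | CW(prev)) * P(word | CW(prev), PW(word))."""
    ctx, ctx_pw, ctx_pw_w = counts
    cw = conditional_cluster[prev]
    pw = predicted_cluster[word]
    if ctx[cw] == 0 or ctx_pw[(cw, pw)] == 0:
        return 0.0  # unseen event; a real model would smooth or back off here
    p_cluster = ctx_pw[(cw, pw)] / ctx[cw]
    p_word = ctx_pw_w[(cw, pw, word)] / ctx_pw[(cw, pw)]
    return p_cluster * p_word

# Hypothetical hand-made clusterings: the two maps partition the vocabulary
# differently, which is the asymmetry the abstract refers to.
PREDICTED = {"on": "PREP", "at": "PREP", "tuesday": "DAY", "monday": "DAY",
             "noon": "TIME", "night": "TIME"}
CONDITIONAL = {"on": "ON", "at": "AT", "tuesday": "DAY_CTX", "monday": "DAY_CTX",
               "noon": "TIME_CTX", "night": "TIME_CTX"}

tokens = "on tuesday at noon on monday at night".split()
counts = train_acm_bigram(tokens, PREDICTED, CONDITIONAL)
p = acm_bigram_prob("tuesday", "on", counts, PREDICTED, CONDITIONAL)
print(p)
```

In this toy setup the prepositions share one cluster when they are predicted but keep separate clusters when they serve as context, so the model can still distinguish what follows "on" from what follows "at". A symmetric cluster model would have to use a single partition for both roles.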

Cite

Gao, J., Goodman, J. T., Cao, G., & Li, H. (2002). Exploring asymmetric clustering for statistical language modeling. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (Vol. 2002-July, pp. 183–190). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1073083.1073115