Improved smoothing for N-gram language models based on ordinary counts

Robert C. Moore; Chris Quirk

Conference Proceedings

ACL-IJCNLP 2009 - Joint Conf. of the 47th Annual Meeting of the Association for Computational Linguistics and 4th Int. Joint Conf. on Natural Language Processing of the AFNLP, Proceedings of the Conf. (2009) 349-352

DOI: 10.3115/1667583.1667691

Abstract

Kneser-Ney (1995) smoothing and its variants are generally recognized as having the best perplexity of any known method for estimating N-gram language models. Kneser-Ney smoothing, however, requires nonstandard N-gram counts for the lower-order models used to smooth the highest-order model. For some applications, this makes Kneser-Ney smoothing inappropriate or inconvenient. In this paper, we introduce a new smoothing method based on ordinary counts that outperforms all of the previous ordinary-count methods we have tested, with the new method eliminating most of the gap between Kneser-Ney and those methods. © 2009 ACL and AFNLP.
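
To illustrate the distinction the abstract draws between ordinary and nonstandard counts, the following is a minimal sketch, not the method introduced in the paper. It assumes a bigram model whose lower-order level is the unigram level, and it contrasts ordinary unigram counts with the continuation-style counts that Kneser-Ney smoothing uses there (the number of distinct words observed before each word). The function names and the toy sentence are hypothetical and chosen only for illustration.

```python
from collections import Counter


def ordinary_unigram_counts(tokens):
    """Ordinary counts: how many times each word occurs in the data."""
    return Counter(tokens)


def continuation_counts(tokens):
    """Kneser-Ney style counts for the unigram level (illustrative):
    the number of distinct words that precede each word."""
    # Collapse repeated bigrams so each (previous, word) pair counts once.
    distinct_bigrams = set(zip(tokens, tokens[1:]))
    return Counter(word for _prev, word in distinct_bigrams)


# Hypothetical toy corpus, not taken from the paper.
tokens = "san francisco is in california and san francisco is foggy".split()

ordinary = ordinary_unigram_counts(tokens)
continuation = continuation_counts(tokens)

for word in ("francisco", "is"):
    print(word, "ordinary:", ordinary[word], "continuation:", continuation[word])
```

In this toy corpus, "francisco" occurs twice but follows only one distinct word, so its ordinary and continuation-style counts differ. A system that stores only ordinary counts cannot recover the second quantity at the lower-order level, which is one way to see why the abstract describes Kneser-Ney smoothing as inconvenient for some applications.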

Cite

Moore, R. C., & Quirk, C. (2009). Improved smoothing for N-gram language models based on ordinary counts. In ACL-IJCNLP 2009 - Joint Conf. of the 47th Annual Meeting of the Association for Computational Linguistics and 4th Int. Joint Conf. on Natural Language Processing of the AFNLP, Proceedings of the Conf. (pp. 349–352). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1667583.1667691
