Evaluating language models within a predictive framework: An analysis of ranking distributions

Pierre Alain; Olivier Boëffard; Nelly Barbot

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2006), Vol. 4188 LNCS, pp. 319-326

DOI: 10.1007/11846406_40


Abstract

Perplexity is a widely used criterion for comparing language models without any task assumptions. Its main drawback is that it presupposes probability distributions and hence cannot compare heterogeneous models. As an evaluation framework, we propose in this article to abandon perplexity and to extend Shannon's entropy idea, which is based on model prediction performance, using rank-based statistics. Our methodology can predict joint word sequences independently of task or model assumptions. Experiments are carried out on the English language with different kinds of language models. We show that long-term prediction language models are not more effective than standard n-gram models. Ranking distributions follow exponential laws, as already observed in predicting letter sequences. These distributions show a second mode not observed with letters, and we propose an interpretation of this mode in this article. © Springer-Verlag Berlin Heidelberg.
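The rank-based idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the toy bigram counter, the alphabetical tie-breaking, and all function names below are assumptions made for illustration. It shows only the general principle that a predictor is scored by the rank at which the actual next word appears in its ordered list of guesses, which requires an ordering but not a probability distribution.

```python
from collections import Counter, defaultdict

def train_bigram_counts(tokens):
    """Count successors of each word (a toy stand-in for any predictor)."""
    successors = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        successors[prev][nxt] += 1
    return successors

def ranked_guesses(successors, vocabulary, prev):
    """Order the whole vocabulary from most to least preferred next word.

    Ties are broken alphabetically (an assumption) so the order is deterministic.
    """
    counts = successors.get(prev, Counter())
    return sorted(vocabulary, key=lambda w: (-counts[w], w))

def rank_distribution(successors, vocabulary, test_tokens):
    """Histogram of the rank (1 = first guess) of each actual next word."""
    ranks = Counter()
    for prev, actual in zip(test_tokens, test_tokens[1:]):
        guesses = ranked_guesses(successors, vocabulary, prev)
        ranks[guesses.index(actual) + 1] += 1
    return ranks

# Hypothetical toy data, chosen only to keep the example self-contained.
train = "the cat sat on the mat the cat ate".split()
test = "the cat sat on the mat".split()
vocab = sorted(set(train))

model = train_bigram_counts(train)
dist = rank_distribution(model, vocab, test)
for rank in sorted(dist):
    print(rank, dist[rank])
```

Because only the ordering of guesses is used, any model that can rank candidate words could be substituted for the bigram counter, which is the sense in which rank statistics allow heterogeneous models to be compared.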

Cite

CITATION STYLE

APA

Alain, P., Boëffard, O., & Barbot, N. (2006). Evaluating language models within a predictive framework: An analysis of ranking distributions. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4188 LNCS, pp. 319–326). Springer Verlag. https://doi.org/10.1007/11846406_40
