Perplexity of n-gram and dependency language models

Abstract

Language models (LMs) are essential components of many applications such as speech recognition or machine translation. LMs factorize the probability of a string of words into a product of conditional probabilities P(w_i | h_i), where h_i is the context (history) of the word w_i. Most LMs use the previous words as the context. This paper presents two alternative approaches: post-ngram LMs, which use the following words as the context, and dependency LMs, which exploit the dependency structure of a sentence and can use, for example, the governing word as the context. Dependency LMs could be useful whenever the topology of a dependency tree is available but its lexical labels are unknown, e.g. in tree-to-tree machine translation. Compared with a baseline interpolated trigram LM, both approaches achieve significantly lower perplexity on all seven tested languages (Arabic, Catalan, Czech, English, Hungarian, Italian, Turkish). © 2010 Springer-Verlag Berlin Heidelberg.
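To make the factorization concrete, the short Python sketch below shows how perplexity is computed from per-word conditional probabilities and how the context h_i differs between a standard trigram LM, a post-ngram LM, and a dependency LM. The function names and the heads array (one governing-word index per token) are illustrative assumptions for this sketch, not the paper's implementation.

    import math

    def perplexity(probabilities):
        # Perplexity = exp(-(1/N) * sum_i log P(w_i | h_i)); lower is better.
        n = len(probabilities)
        return math.exp(-sum(math.log(p) for p in probabilities) / n)

    def trigram_context(words, i):
        # Standard n-gram LM: the two preceding words form the history h_i.
        return tuple(words[max(0, i - 2):i])

    def post_bigram_context(words, i):
        # Post-ngram LM: the following words serve as the context instead.
        return tuple(words[i + 1:i + 3])

    def dependency_context(words, heads, i):
        # Dependency LM: the governing (head) word serves as the context;
        # heads[i] is the index of the parent of word i, or None for the root.
        return (words[heads[i]],) if heads[i] is not None else ()

    # Hypothetical usage: given some estimated model P(w | h),
    # perplexity([P(w, h) for w, h in zip(words, contexts)]) gives the score.
    words = ["the", "dog", "barks"]
    heads = [1, 2, None]  # assumed tree: "the" -> "dog" -> "barks" (root)
    contexts = [dependency_context(words, heads, i) for i in range(len(words))]

The only difference between the three model families in this view is how the context is defined; the perplexity computation itself is identical.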

Citation (APA)

Popel, M., & Mareček, D. (2010). Perplexity of n-gram and dependency language models. In Lecture Notes in Computer Science (Vol. 6231 LNAI, pp. 173–180). Springer. https://doi.org/10.1007/978-3-642-15760-8_23
