Distances between distributions: Comparing language models

Abstract

Language models are used in a variety of fields to support tasks such as classification, next-symbol prediction, and pattern analysis. In order to compare language models, to measure the quality of an acquired model with respect to an empirical distribution, or to evaluate the progress of a learning process, we propose to use distances based on the L2 norm, or quadratic distances. We prove that these distances can not only be estimated through sampling, but can also be effectively computed when both distributions are represented by stochastic deterministic finite automata. We provide a set of experiments showing fast convergence of the sampling estimate of the distance and good scalability, enabling us to use this distance to decide whether two distributions are equal when only samples are provided, or to classify texts. © Springer-Verlag 2004.
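For context, the quadratic distance the abstract refers to is the L2 distance between two distributions over strings, d2(p, q) = sqrt(Σ_w (p(w) − q(w))²). Expanding the square gives three coemission terms, Σ_w p(w)² − 2 Σ_w p(w)q(w) + Σ_w q(w)², each of which can be estimated from collision counts between independent samples. The sketch below is a minimal illustration of this sampling route only; all function and variable names are hypothetical, and the paper's exact estimator, as well as the exact computation on stochastic deterministic finite automata, are given only in the full text.

```python
import math
from collections import Counter

def self_coemission(sample):
    """Unbiased estimate of sum_w p(w)^2 from one i.i.d. sample:
    the rate of collisions among ordered pairs of distinct draws."""
    n = len(sample)
    collisions = sum(c * (c - 1) for c in Counter(sample).values())
    return collisions / (n * (n - 1))

def cross_coemission(sample_p, sample_q):
    """Unbiased estimate of sum_w p(w) * q(w) from two independent
    samples: the collision rate between one draw from each."""
    counts_q = Counter(sample_q)
    hits = sum(counts_q[w] for w in sample_p)
    return hits / (len(sample_p) * len(sample_q))

def d2_estimate(sample_p, sample_q):
    """Plug-in estimate of the quadratic (L2) distance
    d2(p, q) = sqrt(sum_w (p(w) - q(w))^2)."""
    squared = (self_coemission(sample_p)
               - 2.0 * cross_coemission(sample_p, sample_q)
               + self_coemission(sample_q))
    return math.sqrt(max(squared, 0.0))  # sampling noise can push the sum below 0
```

As sample sizes grow, two samples drawn from the same distribution should drive d2_estimate toward 0, while samples from different distributions converge to the true distance; this is the behaviour that an equality test between distributions, of the kind described in the abstract, can exploit.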

Cite

APA: Murgue, T., & de la Higuera, C. (2004). Distances between distributions: Comparing language models. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 3138, 269–277. https://doi.org/10.1007/978-3-540-27868-9_28
