Distances between distributions: Comparing language models

Abstract

Language models are used in a variety of fields to support tasks such as classification, next-symbol prediction, and pattern analysis. In order to compare language models, to measure the quality of an acquired model with respect to an empirical distribution, or to evaluate the progress of a learning process, we propose to use distances based on the L2 norm, or quadratic distances. We prove that these distances can not only be estimated through sampling, but can also be effectively computed when both distributions are represented by stochastic deterministic finite automata. We provide a set of experiments showing fast convergence of the sampling estimate of the distance and good scalability, enabling us to use this distance to decide whether two distributions are equal when only samples are provided, or to classify texts. © Springer-Verlag 2004.
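For context, the quadratic distance the abstract refers to is the L2 distance between two distributions over strings, d2(p, q) = sqrt(Σ_w (p(w) − q(w))²). Expanding the square gives three coemission terms, Σ_w p(w)² − 2 Σ_w p(w)q(w) + Σ_w q(w)², each of which can be estimated from collision counts between independent samples. The sketch below is a minimal illustration of this sampling route only; all function and variable names are hypothetical, and the paper's exact estimator, as well as the exact computation on stochastic deterministic finite automata, are given only in the full text.

```python
import math
from collections import Counter

def self_coemission(sample):
    """Unbiased estimate of sum_w p(w)^2 from one i.i.d. sample:
    the rate of collisions among ordered pairs of distinct draws."""
    n = len(sample)
    collisions = sum(c * (c - 1) for c in Counter(sample).values())
    return collisions / (n * (n - 1))

def cross_coemission(sample_p, sample_q):
    """Unbiased estimate of sum_w p(w) * q(w) from two independent
    samples: the collision rate between one draw from each."""
    counts_q = Counter(sample_q)
    hits = sum(counts_q[w] for w in sample_p)
    return hits / (len(sample_p) * len(sample_q))

def d2_estimate(sample_p, sample_q):
    """Plug-in estimate of the quadratic (L2) distance
    d2(p, q) = sqrt(sum_w (p(w) - q(w))^2)."""
    squared = (self_coemission(sample_p)
               - 2.0 * cross_coemission(sample_p, sample_q)
               + self_coemission(sample_q))
    return math.sqrt(max(squared, 0.0))  # sampling noise can push the sum below 0
```

As sample sizes grow, two samples drawn from the same distribution should drive d2_estimate toward 0, while samples from different distributions converge to the true distance; this is the behaviour that an equality test between distributions, of the kind described in the abstract, can exploit.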

Cite

APA: Murgue, T., & de la Higuera, C. (2004). Distances between distributions: Comparing language models. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 3138, 269–277. https://doi.org/10.1007/978-3-540-27868-9_28
