Word length studies have long been a central issue in quantitative linguistics. Most models were constructed for very specific purposes; that is, an individual model applies only to a specific language, only to token counts, or only to type counts. The present paper takes up the challenge of developing unifying models that account for both type and token frequencies in a moderately large sample of languages (eight Indo-European and two non-Indo-European). We introduce three models that fit all our data well: the exponentiated Hyper-Poisson distribution, the generalized gamma distribution and the Sichel distribution. We also discuss the possibility of interpreting the model parameters linguistically.
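To make the fitting task concrete, here is a minimal sketch of maximum-likelihood fitting of a discrete length distribution to word-length frequencies. It uses the plain (non-exponentiated) Hyper-Poisson distribution, whose probabilities are proportional to θ^x divided by the rising factorial λ^(x); the frequency counts below are invented for illustration and are not taken from the paper, and this is not the authors' estimation procedure, only a generic numerical approach.

```python
import numpy as np
from scipy.optimize import minimize

def hyper_poisson_pmf(x_max, theta, lam):
    # Unnormalized Hyper-Poisson terms theta^x / lam^(x), where
    # lam^(x) = lam*(lam+1)*...*(lam+x-1) is the rising factorial.
    # Normalizing over a long truncated support approximates the
    # confluent hypergeometric normalizer 1F1(1; lam; theta).
    support = np.arange(0, max(x_max + 1, 200))
    log_rising = np.cumsum(
        np.concatenate(([0.0], np.log(lam + support[:-1]))))
    log_terms = support * np.log(theta) - log_rising
    probs = np.exp(log_terms - log_terms.max())
    probs /= probs.sum()
    return probs[: x_max + 1]

# Hypothetical word-length frequencies (lengths 1..9, counts invented).
lengths = np.arange(1, 10)
counts = np.array([120, 310, 280, 160, 80, 30, 12, 5, 3])

def neg_log_lik(params):
    theta, lam = params
    if theta <= 0 or lam <= 0:
        return np.inf
    # Shift lengths by 1 so the model's support starts at 0.
    p = hyper_poisson_pmf(lengths.max() - 1, theta, lam)
    p = np.clip(p[lengths - 1], 1e-12, None)
    return -np.sum(counts * np.log(p))

res = minimize(neg_log_lik, x0=[2.0, 1.0], method="Nelder-Mead")
theta_hat, lam_hat = res.x
```

With λ = 1 the rising factorial reduces to x!, so the Hyper-Poisson collapses to an ordinary Poisson; the extra parameter λ is what lets the model flex between under- and over-dispersed length distributions.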
CITATION STYLE
Zörnig, P., & Berg, T. (2023). Unifying Models for Word Length Distributions Based on Types and Tokens. Journal of Quantitative Linguistics, 30(2), 167–182. https://doi.org/10.1080/09296174.2023.2202061