Phonotactic complexity and its trade-offs

Abstract

We present methods for calculating a measure of phonotactic complexity, bits per phoneme, that permits a straightforward cross-linguistic comparison. Given a word, represented as a sequence of phonemic segments such as symbols in the International Phonetic Alphabet, and a statistical model trained on a sample of word types from the language, we can approximately measure bits per phoneme using the negative log-probability of that word under the model. This simple measure allows us to compare entropy across languages, giving insight into how complex a language’s phonotactics is. Using a collection of 1016 basic concept words across 106 languages, we demonstrate a very strong negative correlation of −0.74 between bits per phoneme and the average length of words.
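To make the measure concrete, the following is a minimal sketch of computing bits per phoneme from a word's negative log-probability. It is not the models used in the article: an add-one-smoothed phoneme bigram model stands in as an assumption, and the names `train_bigram`, `bits_per_phoneme`, and the toy word list are hypothetical. The sketch also assumes that the end-of-word marker counts as a predicted symbol when normalizing, and that every phoneme in a scored word appears in the training sample.

```python
import math
from collections import Counter

BOS, EOS = "<s>", "</s>"  # hypothetical word-boundary markers


def train_bigram(words):
    """Count phoneme bigrams over word types (each word is a list of phonemes)."""
    bigrams, contexts, vocab = Counter(), Counter(), {EOS}
    for word in words:
        symbols = [BOS] + list(word) + [EOS]
        vocab.update(word)
        for prev, cur in zip(symbols, symbols[1:]):
            bigrams[(prev, cur)] += 1
            contexts[prev] += 1
    return bigrams, contexts, vocab


def bits_per_phoneme(word, model):
    """Negative log2-probability of the word, averaged over predicted symbols."""
    bigrams, contexts, vocab = model
    symbols = [BOS] + list(word) + [EOS]
    total_bits = 0.0
    for prev, cur in zip(symbols, symbols[1:]):
        # Add-one smoothing over the phoneme inventory plus the end marker.
        p = (bigrams[(prev, cur)] + 1) / (contexts[prev] + len(vocab))
        total_bits -= math.log2(p)
    # Assumption: normalize by the phonemes plus the end-of-word symbol.
    return total_bits / (len(symbols) - 1)


# Toy data for illustration only; not drawn from the article's word list.
toy_words = [["k", "a", "t"], ["k", "a", "p"], ["t", "a", "k"], ["p", "a", "t"]]
model = train_bigram(toy_words)
attested = bits_per_phoneme(["k", "a", "t"], model)
unattested = bits_per_phoneme(["t", "k", "a"], model)
print(round(attested, 3), round(unattested, 3))
```

On this toy sample the attested form "kat" scores 1.5 bits per phoneme, lower than the unattested sequence "tka". Averaging such scores over held-out words would give a per-language estimate that can then be compared with average word length.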

Cite

Pimentel, T., Roark, B., & Cotterell, R. (2020). Phonotactic complexity and its trade-offs. Transactions of the Association for Computational Linguistics, 8, 1–18. https://doi.org/10.1162/tacl_a_00296
