Abstract
We examine various string distance measures for their suitability in modeling dialect distance, especially its perception. We find that measures which do not normalize for word length, but which are sensitive to order, are superior. We likewise find evidence for the superiority of measures which incorporate a sensitivity to phonological context, realized in the form of n-grams, although we cannot identify which form of context (bigram, trigram, etc.) is best. However, we find no clear benefit in using gradual as opposed to binary segmental differences when calculating sequence distances.
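To make the contrasts in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes that the order-sensitive measure is a plain Levenshtein distance with binary (0/1) segment costs, that phonological context is added by comparing sequences of n-grams instead of single segments, and that length normalization means dividing by the length of the longer sequence. The function names, the `#` boundary padding, and the normalization choice are all illustrative assumptions.

```python
def levenshtein(a, b):
    """Edit distance between two sequences using binary (0/1) segment costs."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1  # binary difference, not gradual
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def ngrams(segments, n):
    """Turn a segment sequence into overlapping n-grams, padded with '#'
    at the word boundaries (the padding symbol is an assumption)."""
    padded = ["#"] * (n - 1) + list(segments) + ["#"] * (n - 1)
    return [tuple(padded[i:i + n]) for i in range(len(padded) - n + 1)]


def normalized(a, b):
    """Length-normalized variant: raw distance divided by the longer length
    (one possible normalization, assumed here for illustration)."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


if __name__ == "__main__":
    w1, w2 = "melk", "molke"  # hypothetical transcriptions
    print("unigram, raw:       ", levenshtein(w1, w2))
    print("unigram, normalized:", normalized(w1, w2))
    print("bigram, raw:        ", levenshtein(ngrams(w1, 2), ngrams(w2, 2)))
```

A gradual variant would replace the 0/1 `cost` with a graded segment distance, for example one derived from phonetic features. The abstract reports no clear benefit from doing so.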
Cite
Heeringa, W., Kleiweg, P., Gooskens, C., & Nerbonne, J. (2006). Evaluation of String Distance Algorithms for Dialectology. In COLING ACL 2006 - Linguistic Distances, Proceedings of the Workshop (pp. 51–62). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1641976.1641984