Evaluation of String Distance Algorithms for Dialectology

Wilbert Heeringa; Peter Kleiweg; Charlotte Gooskens; John Nerbonne

Conference Proceedings

COLING ACL 2006 - Linguistic Distances, Proceedings of the Workshop (2006) 51-62

DOI: 10.3115/1641976.1641984

62 Citations

107 Readers

Abstract

We examine various string distance measures for suitability in modeling dialect distance, especially its perception. We find measures superior which do not normalize for word length, but which are sensitive to order. We likewise find evidence for the superiority of measures which incorporate a sensitivity to phonological context, realized in the form of n-grams, although we cannot identify which form of context (bigram, trigram, etc.) is best. However, we find no clear benefit in using gradual as opposed to binary segmental difference when calculating sequence distances.

Cite (APA)

Heeringa, W., Kleiweg, P., Gooskens, C., & Nerbonne, J. (2006). Evaluation of String Distance Algorithms for Dialectology. In COLING ACL 2006 - Linguistic Distances, Proceedings of the Workshop (pp. 51–62). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1641976.1641984
