Evaluation of Several Phonetic Similarity Algorithms on the Task of Cognate Identification

Grzegorz Kondrak; Tarek Sherif

Conference Proceedings (Open Access)

COLING ACL 2006 - Linguistic Distances, Proceedings of the Workshop (2006) 43-50

DOI: 10.3115/1641976.1641983

Abstract

We investigate the problem of measuring phonetic similarity, focusing on the identification of cognates: words of the same origin in different languages. We compare representatives of two principal approaches to computing phonetic similarity: manually designed metrics and learning algorithms. In particular, we consider a stochastic transducer, a Pair HMM, several DBN models, and two constructed schemes. We test these approaches on the task of identifying cognates among Indo-European languages, in both supervised and unsupervised contexts. Our results suggest that the averaged context DBN model and the Pair HMM achieve the highest accuracy given a large training set of positive examples.

Citation (APA)

Kondrak, G., & Sherif, T. (2006). Evaluation of Several Phonetic Similarity Algorithms on the Task of Cognate Identification. In COLING ACL 2006 - Linguistic Distances, Proceedings of the Workshop (pp. 43–50). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1641976.1641983
