In this paper, we present an approach to automatically revealing phonological correspondences between historically related languages. We create two bilingual pronunciation dictionaries, one for each of the language pairs German-Dutch and German-English. The data is used to automatically learn phonological similarities within each language pair via EM-based clustering. We apply our models to predict, from the phonological form of a German word, the phonemes of its Dutch and English cognates. The similarity scores show that German and Dutch phonemes are more similar than German and English phonemes, which provides statistical evidence for the common knowledge that German is more closely related to Dutch than to English. We assess our approach qualitatively and find meaningful classes that reflect historical sound changes. The classes can be used for language learning.
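The abstract does not spell out the clustering model, so the following is only a minimal sketch of one common form of EM-based clustering for paired data: a latent class model p(x, y) = Σ_c p(c) p(x|c) p(y|c), where x is a source-language phoneme and y its aligned target-language phoneme. The function names `em_cluster` and `pair_prob`, the number of classes, and the toy counts are all assumptions made for illustration. They are not taken from the paper.

```python
import random
from collections import defaultdict


def em_cluster(pair_counts, n_classes=2, n_iter=50, seed=0):
    """Fit p(x, y) = sum_c p(c) p(x|c) p(y|c) to weighted pairs with EM.

    pair_counts maps (source_phoneme, target_phoneme) to a frequency.
    Returns (p_c, p_x, p_y): class priors and per-class distributions.
    """
    rng = random.Random(seed)
    xs = sorted({x for x, _ in pair_counts})
    ys = sorted({y for _, y in pair_counts})

    def random_dist(keys):
        # Strictly positive random start so every pair has nonzero probability.
        weights = {k: rng.random() + 0.1 for k in keys}
        z = sum(weights.values())
        return {k: w / z for k, w in weights.items()}

    p_c = [1.0 / n_classes] * n_classes
    p_x = [random_dist(xs) for _ in range(n_classes)]
    p_y = [random_dist(ys) for _ in range(n_classes)]

    for _ in range(n_iter):
        # E-step: distribute each pair's frequency over the classes.
        n_c = [0.0] * n_classes
        n_x = [defaultdict(float) for _ in range(n_classes)]
        n_y = [defaultdict(float) for _ in range(n_classes)]
        for (x, y), freq in pair_counts.items():
            joint = [p_c[c] * p_x[c][x] * p_y[c][y] for c in range(n_classes)]
            z = sum(joint)
            if z == 0.0:
                continue
            for c in range(n_classes):
                r = freq * joint[c] / z
                n_c[c] += r
                n_x[c][x] += r
                n_y[c][y] += r

        # M-step: re-estimate all parameters from the expected counts.
        total = sum(n_c)
        for c in range(n_classes):
            p_c[c] = n_c[c] / total
            if n_c[c] > 0.0:
                p_x[c] = {x: n_x[c][x] / n_c[c] for x in xs}
                p_y[c] = {y: n_y[c][y] / n_c[c] for y in ys}

    return p_c, p_x, p_y


def pair_prob(model, x, y):
    """Model probability of a phoneme pair, usable as a similarity score."""
    p_c, p_x, p_y = model
    return sum(
        p_c[c] * p_x[c].get(x, 0.0) * p_y[c].get(y, 0.0)
        for c in range(len(p_c))
    )


# Toy German-Dutch phoneme pairs; the counts are invented for illustration.
toy_counts = {("ts", "t"): 8, ("t", "d"): 6, ("a", "a"): 10, ("p", "p"): 5}
toy_model = em_cluster(toy_counts, n_classes=2, n_iter=30, seed=1)
print(pair_prob(toy_model, "ts", "t"))
```

In a sketch like this, the per-class distributions group source and target phonemes that tend to co-occur, and `pair_prob` gives a score that could be compared across language pairs. How the paper actually computes its similarity scores is not stated in the abstract.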
CITATION STYLE
Müller, K. (2005). Revealing phonological similarities between related languages from automatically generated parallel corpora. In Texts@ACL 2005 - Building and Using Parallel Texts: Data-Driven Machine Translation and Beyond, Proceedings of the Workshop (pp. 33–40). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1654449.1654455