The determination of recurrent sound correspondences between languages is crucial for the identification of cognates, which are often employed in statistical machine translation for sentence and word alignment. In this paper, an algorithm designed for extracting non-compositional compounds from bitexts is shown to be capable of determining complex sound correspondences in bilingual wordlists. In experimental evaluation, a C++ implementation of the algorithm achieves approximately 90% recall and precision on authentic language data.
CITATION
Kondrak, G. (2003). Identifying complex sound correspondences in bilingual wordlists. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 2588, pp. 432–443). Springer Verlag. https://doi.org/10.1007/3-540-36456-0_46