Polyphon: An algorithm for phonetic string matching in Russian language


Abstract

Data cleansing is a crucial task in business intelligence. We propose a new phonetic algorithm for string matching in Russian that requires no transliteration from Cyrillic to Latin characters. It is based on the rules of sound formation in Russian. Additionally, we consider an extended algorithm for matching Cyrillic strings in which the letters of the phonetic code are represented by prime numbers and the code of a string is the sum of these numbers. Experimental results show that our algorithms accurately match phonetically similar strings in Russian.
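The two ideas in the abstract, a phonetic code built directly from Cyrillic letters and a numeric code formed by summing primes, can be illustrated with a minimal sketch. The abstract does not give the actual rule table, so everything concrete here is an assumption made for illustration: the substitution map PHONETIC_MAP (merging some vowels and voiced/unvoiced consonant pairs, dropping ъ and ь), the collapsing of doubled letters, the assignment of primes in alphabetical order, and the function names phonetic_code and prime_sum. It is not the rule set published in the paper.

```python
# Illustrative sketch only: the substitution rules below are hypothetical
# and are NOT the rule table from the Polyphon paper.
RUSSIAN_ALPHABET = "абвгдеёжзийклмнопрстуфхцчшщъыьэюя"

# Assumed phonetic merges: similar-sounding vowels, voiced -> unvoiced
# consonants, and removal of the hard and soft signs.
PHONETIC_MAP = str.maketrans({
    "о": "а", "я": "а",
    "е": "и", "ё": "и", "э": "и", "ы": "и", "й": "и",
    "ю": "у",
    "б": "п", "в": "ф", "г": "к", "д": "т", "ж": "ш", "з": "с",
    "ъ": None, "ь": None,
})


def first_primes(n):
    """Return the first n primes by trial division against earlier primes."""
    primes = []
    candidate = 2
    while len(primes) < n:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


# Letters that can appear in a phonetic code, each paired with a prime.
# The alphabetical assignment order is an arbitrary choice for this sketch.
CODE_LETTERS = sorted(set(RUSSIAN_ALPHABET.translate(PHONETIC_MAP)))
PRIME_OF = dict(zip(CODE_LETTERS, first_primes(len(CODE_LETTERS))))


def phonetic_code(word):
    """Map a Cyrillic word to its (hypothetical) phonetic letter code."""
    letters = [ch for ch in word.lower().translate(PHONETIC_MAP) if ch in PRIME_OF]
    collapsed = []
    for ch in letters:
        # Collapse runs of the same code letter, e.g. "нн" -> "н".
        if not collapsed or collapsed[-1] != ch:
            collapsed.append(ch)
    return "".join(collapsed)


def prime_sum(word):
    """Numeric code: the sum of the primes assigned to the code letters."""
    return sum(PRIME_OF[ch] for ch in phonetic_code(word))


if __name__ == "__main__":
    for name in ("Петров", "Пётроф", "Иванов"):
        print(name, phonetic_code(name), prime_sum(name))
```

Under these assumed rules, the spellings "Петров" and "Пётроф" receive the same letter code and therefore the same prime sum. Because addition ignores order, two words with the same code letters in a different order also share a sum, so a matcher built this way might compare the letter codes as well as the sums.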

Citation (APA)

Paramonov, V. V., Shigarov, A. O., Ruzhnikov, G. M., & Belykh, P. V. (2016). Polyphon: An algorithm for phonetic string matching in Russian language. In Communications in Computer and Information Science (Vol. 639, pp. 568–579). Springer Verlag. https://doi.org/10.1007/978-3-319-46254-7_46
