Any cross-language processing application must first tackle the problem of transliteration when facing a language that uses another script. The first solution consists of using existing transliteration tools, but these tools are usually not suitable for all purposes, and for some specific script pairs they do not even exist. Our aim is to discriminate transliterations across different scripts in a unified way, using a learning method that builds a transliteration model out of a set of transliterated proper names. We compare two strings using an algorithm that computes a Levenshtein edit distance using n-gram costs. The evaluations carried out show that our similarity measure is accurate. © 2008 Springer-Verlag Berlin Heidelberg.
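The idea of an edit distance driven by n-gram costs can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `ngram_edit_distance`, the `costs` table format, the `max_n` limit, and the example cost values are all assumptions made for illustration, and the abstract does not specify how the costs are learned or how the score is normalized. The sketch only shows how a standard Levenshtein dynamic program can be extended so that a source n-gram may be rewritten as a target n-gram at a cost read from a table.

```python
def ngram_edit_distance(source, target, costs, max_n=2, default_cost=1.0):
    """Edit distance in which n-gram rewrites may carry table-given costs.

    `costs` maps (source_ngram, target_ngram) pairs to a cost; it is a
    hypothetical stand-in for a learned transliteration model. Single-character
    insertions, deletions and substitutions missing from the table fall back
    to `default_cost`, so an empty table yields plain Levenshtein distance.
    """
    rows, cols = len(source) + 1, len(target) + 1
    inf = float("inf")
    dist = [[inf] * cols for _ in range(rows)]
    dist[0][0] = 0.0

    for i in range(rows):
        for j in range(cols):
            base = dist[i][j]
            if base == inf:
                continue
            # Consume a source n-gram of length a and a target n-gram of length b.
            for a in range(max_n + 1):
                if i + a > len(source):
                    break
                for b in range(max_n + 1):
                    if a == 0 and b == 0:
                        continue
                    if j + b > len(target):
                        break
                    s, t = source[i:i + a], target[j:j + b]
                    if s == t:
                        cost = 0.0  # identical n-grams match for free
                    elif (s, t) in costs:
                        cost = costs[(s, t)]  # table-given n-gram cost
                    elif a <= 1 and b <= 1:
                        cost = default_cost  # ordinary unit edit
                    else:
                        continue  # unknown multi-character rewrite: not allowed
                    if base + cost < dist[i + a][j + b]:
                        dist[i + a][j + b] = base + cost

    return dist[-1][-1]


# Hypothetical Cyrillic-to-Latin costs, chosen by hand for illustration only.
example_costs = {("ш", "sh"): 0.1, ("а", "a"): 0.1, ("х", "h"): 0.1}
score = ngram_edit_distance("шах", "shah", example_costs)
```

With the illustrative table, the rewrite of "ш" as "sh" is charged once as a cheap n-gram operation rather than as a substitution plus an insertion, which is what lets such a measure score names across scripts as similar.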
CITATION STYLE
Pouliquen, B. (2008). Similarity of names across scripts: Edit distance using learned costs of N-grams. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5221 LNAI, pp. 405–416). https://doi.org/10.1007/978-3-540-85287-2_39