Detecting transliterated orthographic variants via two similarity metrics

Abstract

We propose a method for detecting orthographic variants caused by transliteration in a large corpus. The method combines two similarity measures: string similarity based on edit distance, and contextual similarity computed with a vector space model. Experimental results show that the method achieved an F-measure of 0.889 in an open test.
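As a rough illustration of how two such similarities might be combined, the sketch below pairs a normalised edit-distance score with a cosine similarity over bag-of-words context vectors. It is not the authors' implementation: the abstract does not give the exact formulas, so the unit edit costs, the normalisation by the longer string, the raw-count context vectors, the function names, and both thresholds are assumptions chosen only for the example.

```python
from collections import Counter
from math import sqrt


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance with unit costs (an assumption, not the paper's costs)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def string_similarity(a: str, b: str) -> float:
    """Edit distance normalised to [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def context_similarity(ctx_a: list, ctx_b: list) -> float:
    """Cosine similarity between raw-count vectors of surrounding words."""
    va, vb = Counter(ctx_a), Counter(ctx_b)
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm = (sqrt(sum(c * c for c in va.values()))
            * sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0


def is_variant_pair(a, b, ctx_a, ctx_b,
                    string_threshold=0.7, context_threshold=0.5):
    """Accept a pair only if both similarities clear their (hypothetical) thresholds."""
    return (string_similarity(a, b) >= string_threshold
            and context_similarity(ctx_a, ctx_b) >= context_threshold)
```

Requiring both scores to pass reflects the idea that spelling closeness alone can match unrelated words, while shared context alone can match synonyms that are not spelling variants. For instance, `is_variant_pair("スパゲッティ", "スパゲティ", ["eat", "pasta", "sauce"], ["pasta", "sauce", "cook"])` compares two strings one edit apart whose invented contexts share two of three words.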

Cite

Ohtake, K., Sekiguchi, Y., & Yamamoto, K. (2004). Detecting transliterated orthographic variants via two similarity metrics. In COLING 2004 - Proceedings of the 20th International Conference on Computational Linguistics. Association for Computational Linguistics (ACL). https://doi.org/10.3115/1220355.1220457
