PR + RQ ≈ PQ: Transliteration Mining Using Bridge Language

Mitesh M. Khapra; Raghavendra Udupa; A. Kumaran; Pushpak Bhattacharyya

Proceedings of the 24th AAAI Conference on Artificial Intelligence, AAAI 2010 (2010) 1346-1351

Abstract

We address the problem of mining name transliterations from comparable corpora in languages P and Q in the following resource-poor scenario:

• Parallel names in PQ are not available for training.
• Parallel names in PR and RQ are available for training.

We propose a novel solution for the problem by computing a common geometric feature space for P, Q and R where name transliterations are mapped to similar vectors. We employ Canonical Correlation Analysis (CCA) to compute the common geometric feature space using only parallel names in PR and RQ and without requiring parallel names in PQ. We test our algorithm on data sets in several languages and show that it gives results comparable to the state-of-the-art transliteration mining algorithms that use parallel names in PQ for training.
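As a rough illustration of the CCA step only, the sketch below fits a regularized two-view CCA with NumPy and then matches held-out pairs by nearest neighbour in the shared space. It is not the authors' implementation: the abstract does not say which name features are used or how the bridge language R ties the P and Q projections together, so the synthetic feature matrices `X` and `Y`, the helper `fit_cca`, the regularizer `reg`, and the Euclidean matching rule are all assumptions made for this example.

```python
import numpy as np

def fit_cca(X, Y, k, reg=1e-3):
    """Regularized CCA (hypothetical helper): returns the view means,
    projection matrices A and B, and the top-k canonical correlations."""
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mx, Y - my
    n = X.shape[0]
    Cxx = Xc.T @ Xc / n + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / n + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / n
    # Whiten each view with its Cholesky factor, then take an SVD of the
    # whitened cross-covariance T = Lx^{-1} Cxy Ly^{-T}.
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)
    T = np.linalg.solve(Lx, Cxy) @ np.linalg.inv(Ly).T
    U, s, Vt = np.linalg.svd(T, full_matrices=False)
    A = np.linalg.solve(Lx.T, U[:, :k])     # projection for view X
    B = np.linalg.solve(Ly.T, Vt.T[:, :k])  # projection for view Y
    return mx, my, A, B, s[:k]

# Synthetic stand-in for paired name features: both views are noisy
# linear images of the same latent vectors (an assumption of this demo).
rng = np.random.default_rng(0)
n_train, n_test, d_latent = 300, 40, 4
Z = rng.normal(size=(n_train + n_test, d_latent))
Wx = rng.normal(size=(d_latent, 10))
Wy = rng.normal(size=(d_latent, 8))
X = Z @ Wx + 0.05 * rng.normal(size=(n_train + n_test, 10))
Y = Z @ Wy + 0.05 * rng.normal(size=(n_train + n_test, 8))

# Fit on the training pairs, then project held-out items from each view.
mx, my, A, B, corrs = fit_cca(X[:n_train], Y[:n_train], k=d_latent)
Px = (X[n_train:] - mx) @ A
Py = (Y[n_train:] - my) @ B

# "Mining": for each held-out X item, pick the closest Y item in the
# shared space and check whether it is the true partner.
dists = np.linalg.norm(Px[:, None, :] - Py[None, :, :], axis=2)
predicted = dists.argmin(axis=1)
accuracy = float((predicted == np.arange(n_test)).mean())
print("canonical correlations:", np.round(corrs, 3))
print("held-out matching accuracy:", accuracy)
```

In a real setting the rows of `X` and `Y` would be feature vectors of paired names (for example character n-gram counts), and a threshold on the distance would decide which candidate pairs are kept as transliterations.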

Khapra, M. M., Udupa, R., Kumaran, A., & Bhattacharyya, P. (2010). PR + RQ ≈ PQ: Transliteration Mining Using Bridge Language. In Proceedings of the 24th AAAI Conference on Artificial Intelligence, AAAI 2010 (pp. 1346–1351). AAAI Press.
