Transliteration Better than Translation? Answering Code-mixed Questions over a Knowledge Base


Abstract

Humans can learn multiple languages: if they know a fact in one language, they can answer a question about it in another language they understand, and they can also answer code-mixed (CM) questions, which mix words from both languages. This ability is often attributed to humans' unique capacity for learning. Our task studies whether machines can do the same and how effectively they can answer CM questions. We adopt a two-step approach, candidate generation followed by candidate re-ranking, and propose a Triplet-Siamese-Hybrid CNN (TSHCNN) for re-ranking candidate answers. Experiments on the SimpleQuestions dataset show that our network, trained only on the English questions provided in this dataset and noisy Hindi translations of these questions, answers English-Hindi CM questions effectively without translating them into English. Back-transliterated CM questions outperform their lexical-level and sentence-level translated counterparts by 5% and 35% respectively, highlighting the efficacy of our approach in a resource-constrained setting.
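
The abstract describes a two-step pipeline in which candidates are first generated and then re-ranked by a triplet Siamese CNN. The sketch below illustrates that general idea only: a shared 1-D CNN encoder trained with a triplet margin loss and used to score candidates by similarity to the question. The layer sizes, vocabulary handling, and class names (`SiameseCNNEncoder`, `TripletReRanker`) are illustrative assumptions, not the authors' exact TSHCNN architecture.

```python
# Minimal sketch of a triplet-Siamese CNN re-ranker (illustrative, not the
# paper's exact TSHCNN): a shared 1-D CNN encoder plus a triplet margin loss.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SiameseCNNEncoder(nn.Module):
    """Shared CNN encoder mapping a padded token-id sequence to a fixed vector."""

    def __init__(self, vocab_size, embed_dim=100, num_filters=128, kernel_size=3):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.conv = nn.Conv1d(embed_dim, num_filters, kernel_size, padding=1)

    def forward(self, token_ids):
        x = self.embedding(token_ids)              # (batch, seq_len, embed_dim)
        x = x.transpose(1, 2)                      # (batch, embed_dim, seq_len)
        x = F.relu(self.conv(x))                   # (batch, num_filters, seq_len)
        x = F.max_pool1d(x, x.size(2)).squeeze(2)  # (batch, num_filters)
        return x


class TripletReRanker(nn.Module):
    """Trains with a triplet loss; scores candidates by similarity to the question."""

    def __init__(self, vocab_size):
        super().__init__()
        self.encoder = SiameseCNNEncoder(vocab_size)
        self.loss = nn.TripletMarginLoss(margin=1.0)

    def forward(self, question, pos_candidate, neg_candidate):
        q = self.encoder(question)
        p = self.encoder(pos_candidate)
        n = self.encoder(neg_candidate)
        return self.loss(q, p, n)

    def score(self, question, candidate):
        return F.cosine_similarity(self.encoder(question), self.encoder(candidate))


# Toy usage: re-rank two candidate answers for one padded, tokenized question.
if __name__ == "__main__":
    model = TripletReRanker(vocab_size=5000)
    question = torch.randint(1, 5000, (1, 12))
    candidates = torch.randint(1, 5000, (2, 12))
    scores = model.score(question.expand(2, -1), candidates)
    print(scores.argsort(descending=True))  # candidate indices, best first
```

In this sketch the same encoder weights are applied to the question and to every candidate, so code-mixed or transliterated input only changes the token sequence, not the model; this mirrors, at a high level, why re-ranking can work without translating the question into English.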

Citation (APA)

Gupta, V., Chinnakotla, M., & Shrivastava, M. (2018). Transliteration Better than Translation? Answering Code-mixed Questions over a Knowledge Base. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 39–50). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/w18-3205
