Automated Mining Of Names Using Parallel Hindi-English Corpus

Abstract

Machine transliteration has applications in a variety of natural language processing tasks, such as machine translation, information retrieval and question answering. Automated learning of machine transliteration requires a large parallel corpus of names in two scripts. In this paper we present a simple yet powerful method for automatically mining Hindi-English names from a parallel corpus. On average, 93% precision and 85% recall are achieved in mining proper names. The method works even with a small corpus. We compare our results with the Giza++ word alignment tool, which yields 30% precision and 63% recall on the same corpora. We also demonstrate that the same name-mining method works for other Indian languages.

Cite

Mahesh, R., & Sinha, K. (2009). Automated Mining Of Names Using Parallel Hindi-English Corpus. In Proceedings of the 7th Workshop on Asian Language Resources, ALR 2009 - in conjunction with the Joint Conference of the 47th Annual Meeting of the Association for Computational Linguistics and the 4th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing (pp. 48–54). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1690299.1690306
