Extraction of name and transliteration in monolingual and parallel corpora

Abstract

Named entities in free text pose a challenge for text analysis in Machine Translation and Cross-Language Information Retrieval. These phrases are often transliterated into another language with a different sound inventory and writing system, and they are often not listed in bilingual dictionaries. Although it is possible to identify and translate named entities on the fly without a list of proper names and transliterations, an extensive list of existing transliterations will certainly ensure a high precision rate. We use a seed list of proper names and transliterations to train a Machine Transliteration Model. With this model, it is possible to extract proper names and their transliterations from monolingual or parallel corpora with high precision and recall rates. © Springer-Verlag 2004.
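The abstract does not specify the form of the Machine Transliteration Model, so the following is only a minimal sketch of the general idea of training on a seed list and then scoring candidate pairs. It swaps in a much simpler technique: character translation probabilities estimated from one pass of uniform co-occurrence counts (in the spirit of a single IBM Model 1 iteration), not the authors' method. The function names, the `FLOOR` smoothing value, the score threshold, and the assumption that both sides are plain character strings are all hypothetical.

```python
import math
from collections import defaultdict

FLOOR = 1e-6  # assumed smoothing value for character pairs never seen in the seed list


def train_char_model(seed_pairs):
    """Estimate t(target_char | source_char) from seed (name, transliteration) pairs.

    Each target character spreads one unit of count evenly over the
    characters of its paired source name; counts are then normalized
    per source character.
    """
    counts = defaultdict(lambda: defaultdict(float))
    for src, tgt in seed_pairs:
        if not src or not tgt:
            continue
        share = 1.0 / len(src)
        for t in tgt:
            for s in src:
                counts[s][t] += share
    model = {}
    for s, row in counts.items():
        total = sum(row.values())
        model[s] = {t: c / total for t, c in row.items()}
    return model


def score(model, src, tgt):
    """Length-normalized log-likelihood of tgt given src under the toy model."""
    if not src or not tgt:
        return float("-inf")
    log_prob = 0.0
    for t in tgt:
        # Average the probability of producing t over all source characters.
        p = sum(model.get(s, {}).get(t, 0.0) for s in src) / len(src)
        log_prob += math.log(max(p, FLOOR))
    return log_prob / len(tgt)


def extract_pairs(model, names, candidates, threshold):
    """Pair each name with its best-scoring candidate, if it clears the threshold."""
    pairs = []
    for name in names:
        best = max(candidates, key=lambda c: score(model, name, c), default=None)
        if best is not None and score(model, name, best) >= threshold:
            pairs.append((name, best))
    return pairs
```

In a parallel corpus, `names` and `candidates` would come from the two sides of an aligned sentence pair; raising `threshold` trades recall for precision.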

Cite

Lin, T., Wu, J. C., & Chang, J. S. (2004). Extraction of name and transliteration in monolingual and parallel corpora. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 3265, 177–186. https://doi.org/10.1007/978-3-540-30194-3_20
