Identification of bilingual suffix classes for classification and translation generation

Abstract

We examine the possibility of learning bilingual morphology using the translation forms taken from an existing, manually validated, bilingual translation lexicon. The objective is to evaluate the effect of bilingual stem and suffix based features on the performance of an existing Support Vector Machine based classifier trained to classify automatically extracted word-to-word translations. We initially induce the bilingual stem and suffix correspondences by considering the longest sequence common to orthographically similar translations. Clusters of stem pairs characterised by identical suffix pairs are formed, which are then used to generate out-of-vocabulary translations that are similar to, but different from, the previously existing translations, thereby completing the existing lexicon. Using the bilingual stem and suffix correspondences induced from the augmented lexicon, we derive 5 new features that reflect the (non)existence of morphological coverage (agreement) between a term and its translation. Specifically, we examine and evaluate the use of suffix classes and bilingual stem and suffix correspondences as features in selecting correct word-to-word translations from among the automatically extracted ones. With training data of approximately 35.8K word translations for the language pair English-Portuguese, we identified around 6.4K unique stem pairs and 0.25K unique suffix pairs. Further, experimental results show that the newly added features improved the word-to-word classification accuracy by 9.11%, leading to an overall improvement in classifier accuracy of 2.15% when all translations (single- and multi-word translations) were considered.
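The induction and generation steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the shared sequence is a common prefix, uses a toy English-Portuguese lexicon, and every function name and threshold (`MIN_STEM`, `MIN_SHARED`) is hypothetical.

```python
from collections import defaultdict
from itertools import combinations, permutations
from os.path import commonprefix

MIN_STEM = 3    # hypothetical: shortest stem accepted on either side
MIN_SHARED = 2  # hypothetical: suffix pairs two stem pairs must share

def candidate_suffix_pairs(lexicon):
    """Collect (suffix_en, suffix_pt) pairs by comparing translations that
    share a prefix in both languages (assumed stand-in for the paper's
    'longest common sequence')."""
    pairs = set()
    for (en1, pt1), (en2, pt2) in combinations(lexicon, 2):
        stem_en = commonprefix([en1, en2])
        stem_pt = commonprefix([pt1, pt2])
        if len(stem_en) >= MIN_STEM and len(stem_pt) >= MIN_STEM:
            pairs.add((en1[len(stem_en):], pt1[len(stem_pt):]))
            pairs.add((en2[len(stem_en):], pt2[len(stem_pt):]))
    return pairs

def segment(lexicon, suffix_pairs):
    """Map each bilingual stem pair to the set of suffix pairs seen with it."""
    suffixes_by_stem = defaultdict(set)
    for en, pt in lexicon:
        for suf_en, suf_pt in suffix_pairs:
            if en.endswith(suf_en) and pt.endswith(suf_pt):
                stem_en = en[:len(en) - len(suf_en)]
                stem_pt = pt[:len(pt) - len(suf_pt)]
                if len(stem_en) >= MIN_STEM and len(stem_pt) >= MIN_STEM:
                    suffixes_by_stem[(stem_en, stem_pt)].add((suf_en, suf_pt))
    return suffixes_by_stem

def generate(suffixes_by_stem, lexicon):
    """Propose out-of-vocabulary translations: a stem pair borrows the
    suffix pairs of another stem pair it already overlaps with."""
    known = set(lexicon)
    new = set()
    for (stem, sufs), (_, other_sufs) in permutations(suffixes_by_stem.items(), 2):
        if len(sufs & other_sufs) >= MIN_SHARED:
            for suf_en, suf_pt in other_sufs - sufs:
                candidate = (stem[0] + suf_en, stem[1] + suf_pt)
                if candidate not in known:
                    new.add(candidate)
    return new

# Toy lexicon (illustrative data, not taken from the paper).
lexicon = [
    ("declare", "declarar"), ("declared", "declarado"), ("declaring", "declarando"),
    ("accuse", "acusar"), ("accused", "acusado"), ("accusing", "acusando"),
    ("compare", "comparar"), ("compared", "comparado"),
]
suffix_pairs = candidate_suffix_pairs(lexicon)
stems = segment(lexicon, suffix_pairs)
generated = generate(stems, lexicon)
print(generated)
```

On this toy lexicon the sketch proposes the single missing pair `comparing` / `comparando`, because the stem pair `compar` / `compara` shares the suffix pairs `e`/`r` and `ed`/`do` with stem pairs that also carry `ing`/`ndo`. A realistic system would still need the classifier described above to filter generated candidates.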

Cite

Kavitha, K. M., Gomes, L., & Lopes, J. G. P. (2014). Identification of bilingual suffix classes for classification and translation generation. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 8864, 154–166. https://doi.org/10.1007/978-3-319-12027-0_13
