Tagging Portuguese with a Spanish Tagger Using Cognates


Abstract

We describe a knowledge- and resource-light system for automatic morphological analysis and tagging of Brazilian Portuguese. We avoid labor-intensive resources, in particular large annotated corpora and lexicons. Instead, we use (i) an annotated corpus of Peninsular Spanish, a language related to Portuguese, (ii) an unannotated corpus of Portuguese, and (iii) a description of Portuguese morphology at the level of a basic grammar book. We extend our earlier work (Hana et al., 2004; Feldman et al., 2006) by proposing an alternative algorithm for cognate transfer that effectively projects the Spanish emission probabilities into Portuguese. Our experiments require minimal new human effort and show a 21% error reduction over even emissions on a fine-grained tagset.
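The abstract does not spell out the cognate-transfer algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that cognates are detected by normalized edit distance, that a Portuguese word inherits the tag counts of its closest Spanish cognate, and that words without a cognate receive even (uniform) scores across tags. The function names, the threshold value, and the toy data are all hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def closest_cognate(word, es_words, max_norm_dist):
    """Return the nearest Spanish word, or None if nothing is close enough.

    Assumption: "cognate" means normalized edit distance <= max_norm_dist.
    """
    best, best_dist = None, None
    for es in es_words:
        dist = edit_distance(word, es) / max(len(word), len(es))
        if best_dist is None or dist < best_dist:
            best, best_dist = es, dist
    if best is not None and best_dist <= max_norm_dist:
        return best
    return None


def transfer_emissions(es_counts, pt_vocab, tags, max_norm_dist=0.34):
    """Project Spanish tag counts onto Portuguese words via cognates.

    es_counts: Spanish word -> {tag: count} (from the annotated corpus).
    Returns tag -> {Portuguese word: P(word | tag)}.
    """
    scores = {}
    for w in pt_vocab:
        cognate = closest_cognate(w, es_counts, max_norm_dist)
        if cognate is not None:
            scores[w] = dict(es_counts[cognate])  # inherit the cognate's counts
        else:
            scores[w] = {t: 1.0 for t in tags}    # hypothetical even fallback

    emissions = {}
    for t in tags:
        total = sum(s.get(t, 0.0) for s in scores.values())
        if total > 0:
            emissions[t] = {w: s.get(t, 0.0) / total for w, s in scores.items()}
    return emissions


# Toy data, invented for illustration only.
es_counts = {"gato": {"NOUN": 4.0}, "cantar": {"VERB": 2.0}}
emissions = transfer_emissions(es_counts, ["gato", "cantar", "você"], ["NOUN", "VERB"])
```

In this toy run, "gato" and "cantar" match identical Spanish forms and inherit their counts, while "você" has no close Spanish neighbor and falls back to even scores before normalization.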

Citation (APA)

Hana, J., Feldman, A., Brew, C., & Amaral, L. (2006). Tagging Portuguese with a Spanish Tagger Using Cognates. In Cross-Language Knowledge Induction Workshop - International Workshop held as part of EACL 2006: 11th Conference of the European Chapter of the Association for Computational Linguistics (pp. 33–40). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1608842.1608847
