We describe a knowledge- and resource-light system for automatic morphological analysis and tagging of Brazilian Portuguese. We avoid labor-intensive resources, in particular large annotated corpora and lexicons. Instead, we use (i) an annotated corpus of Peninsular Spanish, a language related to Portuguese; (ii) an unannotated corpus of Portuguese; and (iii) a description of Portuguese morphology at the level of a basic grammar book. We extend our earlier work (Hana et al., 2004; Feldman et al., 2006) by proposing an alternative algorithm for cognate transfer that effectively projects the Spanish emission probabilities into Portuguese. Our experiments require minimal new human effort and show a 21% error reduction over even emissions on a fine-grained tagset.
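To make the idea of projecting emission probabilities through cognates concrete, here is a minimal sketch. It is not the algorithm from the paper: the pairing rule (nearest source word by length-normalized Levenshtein distance), the threshold `max_norm_dist`, the function names, and the toy vocabularies are all assumptions chosen for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalized_distance(a: str, b: str) -> float:
    """Edit distance divided by the longer word's length."""
    return edit_distance(a, b) / max(len(a), len(b))


def project_emissions(source_emissions, target_vocab, max_norm_dist=0.34):
    """Map P(word | tag) from source words onto likely target cognates.

    source_emissions: {tag: {source_word: probability}}
    Returns {tag: {target_word: probability}}, renormalized per tag.
    The threshold is a hypothetical value, not one taken from the paper.
    """
    source_words = sorted({w for dist in source_emissions.values() for w in dist})

    # Pair each target word with its closest source word, if close enough.
    cognate_of = {}
    for target in target_vocab:
        best = min(source_words, key=lambda s: normalized_distance(target, s))
        if normalized_distance(target, best) <= max_norm_dist:
            cognate_of[target] = best

    # Copy the cognate's emission probability, then renormalize within each tag.
    projected = {}
    for tag, dist in source_emissions.items():
        raw = {t: dist[s] for t, s in cognate_of.items() if s in dist}
        total = sum(raw.values())
        if total > 0:
            projected[tag] = {t: p / total for t, p in raw.items()}
    return projected


# Toy data (invented for illustration): Spanish emissions, Portuguese vocabulary.
spanish_emissions = {
    "NOUN": {"libro": 0.6, "casa": 0.4},
    "VERB": {"canta": 1.0},
}
portuguese_vocab = ["livro", "casa", "canta", "obrigado"]

projected = project_emissions(spanish_emissions, portuguese_vocab)
```

In this toy run, "livro" inherits the probability mass of "libro", while "obrigado" has no sufficiently close Spanish word and receives no projected emission. A real system would need a fallback for such words, for example a distribution over the tags proposed by a morphological analyzer.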
CITATION STYLE
Hana, J., Feldman, A., Brew, C., & Amaral, L. (2006). Tagging Portuguese with a Spanish Tagger Using Cognates. In Cross-Language Knowledge Induction Workshop - International Workshop held as part of EACL 2006: 11th Conference of the European Chapter of the Association for Computational Linguistics (pp. 33–40). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1608842.1608847