Tagsets and Datasets: Some Experiments Based on Portuguese Language

Abstract

We report the results of two experiments investigating the impact of linguistic variation on PoS tagging. In both cases, our starting point is the conversion of the MacMorpho corpus [1], which was re-annotated according to the Universal Dependencies PoS tagset. During the conversion, we faced linguistic challenges related to past participle forms. As a result, we created two new corpora (MacMorpho-UD and MacMorpho-UD+PCP). We used the three corpora (MacMorpho, MacMorpho-UD, and MacMorpho-UD+PCP) to assess the impact on PoS learning in different scenarios.

Cite

Freitas, C., Trugo, L. F., Chalub, F., Paulino-Passos, G., & Rademaker, A. (2018). Tagsets and Datasets: Some Experiments Based on Portuguese Language. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11122 LNAI, pp. 459–469). Springer Verlag. https://doi.org/10.1007/978-3-319-99722-3_46
