Tagsets and Datasets: Some Experiments Based on Portuguese Language

Cláudia Freitas; Luiza F. Trugo; Fabricio Chalub; Guilherme Paulino-Passos; Alexandre Rademaker

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2018) 11122 LNAI 459-469

DOI: 10.1007/978-3-319-99722-3_46

Abstract

We report the results of two experiments investigating the impact of linguistic variation on PoS tagging. In both cases, we start from a conversion of the MacMorpho corpus [1], which was re-annotated according to the Universal Dependencies (UD) PoS tagset. During the conversion, we faced linguistic challenges related to past participle forms. As a result, we created two corpora, MacMorpho-UD and MacMorpho-UD+PCP. We used all three corpora (the original MacMorpho, MacMorpho-UD, and MacMorpho-UD+PCP) to assess the impact on PoS learning in different scenarios.

Cite

Freitas, C., Trugo, L. F., Chalub, F., Paulino-Passos, G., & Rademaker, A. (2018). Tagsets and Datasets: Some Experiments Based on Portuguese Language. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11122 LNAI, pp. 459–469). Springer Verlag. https://doi.org/10.1007/978-3-319-99722-3_46
