In this paper, we present an automatic system for the morphosyntactic annotation and lexicographical evaluation of historical Portuguese corpora. Using rule-based orthographical normalization, we were able to apply a standard parser (PALAVRAS) to historical data (the Colonia corpus) and to achieve accurate annotation for both part of speech (POS) and syntax. By aligning original and standardized word forms, our method makes it possible to create tailor-made standardization dictionaries for historical Portuguese, optionally with frequencies per period or per author.
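The normalize-then-align idea can be illustrated with a minimal sketch. It is not the authors' rule set or pipeline: the rewrite rules, the function names `normalize` and `build_dictionary`, and the sample tokens are all hypothetical, and the parsing step with PALAVRAS is left out entirely. The sketch assumes one-to-one token normalization, so aligning original and standardized forms reduces to pairing each token with its rewritten form.

```python
import re
from collections import Counter, defaultdict

# Hypothetical spelling rules for illustration only; NOT the rules used in the paper.
RULES = [
    (re.compile(r"ph"), "f"),         # pharmacia -> farmacia
    (re.compile(r"th"), "t"),         # theatro -> teatro
    (re.compile(r"y"), "i"),          # ley -> lei
    (re.compile(r"([lt])\1"), r"\1"), # elle -> ele
]

def normalize(token):
    """Apply each rewrite rule in order to a lowercased token."""
    form = token.lower()
    for pattern, replacement in RULES:
        form = pattern.sub(replacement, form)
    return form

def build_dictionary(texts):
    """Map (original, standardized) pairs to per-period frequency counts.

    texts: iterable of (period_label, list_of_tokens).
    Only tokens that the rules actually change are recorded.
    """
    entries = defaultdict(Counter)
    for period, tokens in texts:
        for token in tokens:
            original = token.lower()
            standardized = normalize(token)
            if standardized != original:
                entries[(original, standardized)][period] += 1
    return entries

# Made-up sample data, labeled by period.
sample = [
    ("16th c.", ["pharmacia", "elle", "theatro"]),
    ("18th c.", ["pharmacia", "ley"]),
]
dictionary = build_dictionary(sample)

for (original, standardized), counts in sorted(dictionary.items()):
    print(original, "->", standardized, dict(counts))
```

Keying the counts by period (or, equally, by author) is what would allow such a dictionary to report when a given spelling variant was in use. A realistic system would also need to handle accents, context-dependent rules, and tokens that split or merge during normalization, none of which this sketch attempts.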
Bick, E., & Zampieri, M. (2016). Grammatical annotation of historical Portuguese: Generating a corpus-based diachronic dictionary. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9924 LNCS, pp. 3–11). Springer Verlag. https://doi.org/10.1007/978-3-319-45510-5_1