Effective architecture of the Polish tagger

Maciej Piasecki; Grzegorz Godlewski

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2006) 4188 LNCS 213-220

DOI: 10.1007/11846406_27

9 Citations

2 Readers

Abstract

The large tagset of the IPI PAN Corpus of Polish and the limited size of the learning corpus make the construction of a tagger especially demanding. The goal of this work is to decompose the overall process of tagging Polish into subproblems of partial disambiguation, and to propose a tagger architecture that facilitates this decomposition. The proposed architecture allows hand-written tagging rules to be integrated easily with the rest of the tagger and is open to different types of classifiers. A complete tagger for Polish, called TaKIPI, is also presented. Its configuration, the achieved results (92.55% accuracy for all tokens and 84.75% for ambiguous tokens in a ten-fold test), and the considered variants of the architecture are discussed as well. © Springer-Verlag Berlin Heidelberg.
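As a rough illustration of what decomposing tagging into subproblems of partial disambiguation could look like, the sketch below narrows each token's candidate tags one attribute at a time, with a pluggable selector per layer that may be a hand-written rule or a classifier. It is not TaKIPI's implementation: the abstract does not specify the layers, so the attribute names (`pos`, `number`, `case`), the `Tag` and `Token` types, and both selectors are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Sequence, Tuple


@dataclass(frozen=True)
class Tag:
    # Hypothetical, heavily simplified tag: the real IPI PAN tagset is far larger.
    pos: str
    number: str
    case: str


@dataclass
class Token:
    form: str
    candidates: FrozenSet[Tag]  # tags still considered possible for this token


# A selector chooses one value of an attribute for token i of the sentence.
# Both hand-written rules and trained classifiers can share this interface.
Selector = Callable[[Sequence[Token], int, List[str]], str]


def prefer_noun(sentence: Sequence[Token], i: int, options: List[str]) -> str:
    """Hypothetical hand-written rule: pick 'subst' (noun) when available."""
    return "subst" if "subst" in options else options[0]


def prefer_nominative(sentence: Sequence[Token], i: int, options: List[str]) -> str:
    """Stand-in for a classifier: always favours the nominative case."""
    return "nom" if "nom" in options else options[0]


def disambiguate(sentence: List[Token],
                 layers: Sequence[Tuple[str, Selector]]) -> List[Token]:
    """Apply the layers in order; each one resolves a single attribute."""
    for attribute, selector in layers:
        for i, token in enumerate(sentence):
            options = sorted({getattr(t, attribute) for t in token.candidates})
            if len(options) < 2:
                continue  # this attribute is already unambiguous
            chosen = selector(sentence, i, options)
            # Partial disambiguation: keep only tags agreeing with the choice.
            token.candidates = frozenset(
                t for t in token.candidates if getattr(t, attribute) == chosen
            )
    return sentence


if __name__ == "__main__":
    token = Token("dom", frozenset({
        Tag("subst", "sg", "nom"),
        Tag("subst", "sg", "acc"),
        Tag("adj", "sg", "nom"),
    }))
    layers = [("pos", prefer_noun), ("case", prefer_nominative)]
    print(disambiguate([token], layers)[0].candidates)
```

Because every layer only removes candidates that disagree with one attribute value, a rule and a classifier can be swapped within a layer without touching the others, which is one plausible reading of the architecture being open to different classifier types.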

Cite

Piasecki, M., & Godlewski, G. (2006). Effective architecture of the Polish tagger. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4188 LNCS, pp. 213–220). Springer Verlag. https://doi.org/10.1007/11846406_27
