Stochastic approaches to the tagging of Polish have produced results that are far from satisfactory. However, the successful combination of hand-written rules with a stochastic approach for Czech, as well as some initial experiments in the acquisition of tagging rules for Polish, revealed the potential of a rule-based approach. The goals of this work are to define a language of tagging constraints, to construct a set of reduction rules for Polish, and to apply Machine Learning to the extraction of tagging rules. A language of functional tagging constraints called JOSKIPI is proposed. An extension to the C4.5 algorithm, based on introducing complex JOSKIPI operators into decision trees, is presented. The construction of a preliminary set of hand-written tagging rules for Polish is discussed. Finally, the results of a comparison of different versions of the tagger are given. © Springer-Verlag Berlin Heidelberg.
CITATION STYLE
Piasecki, M. (2006). Hand-written and automatically extracted rules for Polish tagger. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4188 LNCS, pp. 205–212). Springer Verlag. https://doi.org/10.1007/11846406_26