In this paper we survey the key results achieved so far in Hungarian POS tagging. The most successful approaches were selected and re-evaluated on a manually annotated corpus of 1.2 million words. Tests were performed in single-domain, multiple-domain and cross-domain settings. We investigate whether the selected POS tagging methods can be improved further by combining them. Our aim is to build a POS tagger that achieves good results on a fine-grained tag set of more than 1000 tags. Results show that rule-based methods, including Transformation-Based Learning, can be used as effectively as statistical methods for Hungarian POS tagging. Combined methods do increase tagging accuracy, producing significantly better results than those published earlier. We also show that the optimal combination differs between domain-specific and general-purpose taggers. © Springer-Verlag Berlin Heidelberg 2004.
Kuba, A., Hócza, A., & Csirik, J. (2004). POS tagging of Hungarian with combined statistical and rule-based methods. In Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science) (Vol. 3206, pp. 113–120). Springer Verlag. https://doi.org/10.1007/978-3-540-30120-2_15