POS tagging of Hungarian with combined statistical and rule-based methods

András Kuba; András Hócza; János Csirik

Conference Proceedings

Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science) (2004) 3206 113-120

DOI: 10.1007/978-3-540-30120-2_15

Abstract

In this paper we survey the key results achieved so far in Hungarian POS tagging. The most successful approaches have been selected and re-evaluated on a manually annotated corpus containing 1.2 million words. Tests were performed in single-domain, multiple-domain, and cross-domain settings. We investigate the possibilities of further improving the selected POS tagging methods by combining them. Our aim is to build a POS tagger that achieves good results on a fine-grained tag set of more than 1000 tags. Results show that rule-based methods, including Transformation Based Learning, can be used as effectively as statistical methods for Hungarian POS tagging. Combined methods do increase the tagging accuracy, producing significantly better results than those published earlier. We also show that the optimal combination differs between domain-specific and general-purpose taggers. © Springer-Verlag Berlin Heidelberg 2004.

Cite

Kuba, A., Hócza, A., & Csirik, J. (2004). POS tagging of Hungarian with combined statistical and rule-based methods. In Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science) (Vol. 3206, pp. 113–120). Springer Verlag. https://doi.org/10.1007/978-3-540-30120-2_15
