A comparative evaluation of statistical part-of-speech taggers for Russian

Rinat Gareev; Vladimir Ivanov

Conference Proceedings

Communications in Computer and Information Science, vol. 505 (2015), pp. 263–275

DOI: 10.1007/978-3-319-25485-2_8

Abstract

Part-of-speech (POS) tagging is an essential step in many text processing applications. Quite a few works address this task for Russian, but their results are not directly comparable because of the lack of shared datasets and tools. We propose a POS tagging evaluation framework for Russian that comprises existing third-party resources available to researchers. We applied the framework to compare several implementations of statistical classifiers: HunPos, the Stanford POS tagger, the OpenNLP implementation of the MaxEnt Markov Model, and our own reimplementation of Tiered Conditional Random Fields. The best tagger, trained on a corpus of fewer than one million words, achieved an accuracy above 93%. We expect that the evaluation framework will facilitate future studies and improvements in POS tagging for Russian.

Citation (APA)

Gareev, R., & Ivanov, V. (2015). A comparative evaluation of statistical part-of-speech taggers for Russian. In Communications in Computer and Information Science (Vol. 505, pp. 263–275). Springer Verlag. https://doi.org/10.1007/978-3-319-25485-2_8
