Improving part-of-speech tagging by meta-learning

Abstract

Part-of-speech (POS) tagging for Polish has progressed rapidly in recent years. PolEval, a shared task organized in late 2017, prompted many new approaches to the problem, and new deep learning paradigms have helped narrow the gap between the accuracy of POS tagging methods for Polish and for English. Still, taggers make a very large number of errors on large corpora: even the best-performing tagger currently reaches an accuracy of ca. 94.5%, which translates to tens of millions of errors in a billion-word corpus. To further improve the accuracy of Polish POS tagging, we propose a meta-learning approach built on top of several existing taggers. The approach is motivated by the observation that taggers with similar overall accuracy often make different errors, which suggests that some methods perform better than others in specific contexts. We therefore train a machine learning method that captures the relationship between the accuracy of a particular tagger and the language context. The resulting model selects among several taggers in each context so as to maximize the expected tagging accuracy.
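
The selection idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: in place of a trained machine learning model it uses a simple count-based lookup, and the context feature (the last character of the token), the tagger names, the tags, and the toy data are all assumptions made up for illustration.

```python
from collections import defaultdict


def last_char(tokens, i):
    """Hypothetical context feature: the last character of the token."""
    return tokens[i][-1].lower()


def train_selector(tokens, tagger_outputs, gold_tags, context_fn):
    """For each context key, pick the tagger that most often matched the gold tag."""
    hits = defaultdict(lambda: defaultdict(int))
    for i in range(len(tokens)):
        key = context_fn(tokens, i)
        for name, tags in tagger_outputs.items():
            hits[key][name] += int(tags[i] == gold_tags[i])
    # Fallback for unseen contexts: the tagger with the most correct tags overall.
    overall = {
        name: sum(t == g for t, g in zip(tags, gold_tags))
        for name, tags in tagger_outputs.items()
    }
    default = max(sorted(overall), key=overall.get)
    # Ties are broken alphabetically by tagger name.
    best = {key: max(sorted(c), key=c.get) for key, c in hits.items()}
    return best, default


def select_tags(tokens, tagger_outputs, best, default, context_fn):
    """For each token, emit the tag proposed by the tagger chosen for its context."""
    return [
        tagger_outputs[best.get(context_fn(tokens, i), default)][i]
        for i in range(len(tokens))
    ]


# Toy training data (invented): tagger_a is right on tokens ending in "a",
# tagger_b is right on the others.
train_tokens = ["kota", "pies", "mama", "dom"]
train_gold = ["N1", "N2", "N1", "N2"]
train_outputs = {
    "tagger_a": ["N1", "X", "N1", "X"],
    "tagger_b": ["X", "N2", "X", "N2"],
}
best, default = train_selector(train_tokens, train_outputs, train_gold, last_char)

# New tokens with the (invented) outputs of the same two taggers.
new_tokens = ["lampa", "las"]
new_outputs = {"tagger_a": ["N1", "X"], "tagger_b": ["X", "N2"]}
combined = select_tags(new_tokens, new_outputs, best, default, last_char)
print(combined)
```

In this toy setting each tagger alone gets half of the training tokens right, while the per-context selection recovers the correct tag whenever at least one tagger proposes it in a context seen during training. A real system would replace the lookup with a learned classifier over richer context features.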

Cite

Kobyliński, Ł., Wasiluk, M., & Wojdyga, G. (2018). Improving part-of-speech tagging by meta-learning. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11107 LNAI, pp. 144–152). Springer Verlag. https://doi.org/10.1007/978-3-030-00794-2_15
