Large-scale experiments with NP chunking of Polish

Abstract

Published experiments with shallow parsing for Slavic languages are characterised by the small size of the corpora used. The publication of the National Corpus of Polish (NCP) opened a new opportunity: to test several chunking algorithms on the 1-million-token manually annotated subcorpus of the NCP. We test three Machine Learning techniques: Decision Tree induction, Memory-Based Learning and Conditional Random Fields. We also investigate the influence of tagging errors on overall chunker performance, which turns out to be quite substantial. © 2012 Springer-Verlag.

Radziszewski, A., & Pawlaczek, A. (2012). Large-scale experiments with NP chunking of Polish. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7499 LNAI, pp. 143–149). https://doi.org/10.1007/978-3-642-32790-2_17
