Large-scale experiments with NP chunking of Polish

Adam Radziszewski; Adam Pawlaczek

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 7499 LNAI, pp. 143–149 (2012)

DOI: 10.1007/978-3-642-32790-2_17

Abstract

Published experiments with shallow parsing for Slavic languages are characterised by the small size of the corpora used. The publication of the National Corpus of Polish (NCP) opened a new opportunity: to test several chunking algorithms on the 1-million-token manually annotated subcorpus of the NCP. We test three Machine Learning techniques: Decision Tree induction, Memory-Based Learning and Conditional Random Fields. We also investigate the influence of tagging errors on overall chunker performance, which turns out to be quite substantial. © 2012 Springer-Verlag.

Citation (APA)

Radziszewski, A., & Pawlaczek, A. (2012). Large-scale experiments with NP chunking of Polish. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7499 LNAI, pp. 143–149). https://doi.org/10.1007/978-3-642-32790-2_17
