Stagger: an Open-Source Part of Speech Tagger for Swedish

Robert Östling

Journal Article, Open Access

Northern European Journal of Language Technology (2013) 3 1-18

DOI: 10.3384/nejlt.2000-1533.1331

Abstract

This work presents Stagger, a new open-source part of speech tagger for Swedish based on the Averaged Perceptron. By using the SALDO morphological lexicon and semi-supervised learning in the form of Collobert and Weston embeddings, it reaches an accuracy of 96.4% on the standard Stockholm-Umeå Corpus dataset, making it the best single part of speech tagging system reported for Swedish. Accuracy increases to 96.6% on the latest version of the corpus, where the annotation has been revised to increase consistency. Stagger is also evaluated on a new corpus of Swedish blog posts, investigating its out-of-domain performance.

Cite

Östling, R. (2013). Stagger: an Open-Source Part of Speech Tagger for Swedish. Northern European Journal of Language Technology, 3, 1–18. https://doi.org/10.3384/nejlt.2000-1533.1331
