Stagger: an Open-Source Part of Speech Tagger for Swedish

  • Östling R

Abstract

This work presents Stagger, a new open-source part of speech tagger for Swedish based on the Averaged Perceptron. By using the SALDO morphological lexicon and semi-supervised learning in the form of Collobert and Weston embeddings, it reaches an accuracy of 96.4% on the standard Stockholm-Umeå Corpus dataset, making it the best single part of speech tagging system reported for Swedish. Accuracy increases to 96.6% on the latest version of the corpus, where the annotation has been revised to increase consistency. Stagger is also evaluated on a new corpus of Swedish blog posts, investigating its out-of-domain performance.
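
For readers unfamiliar with the averaged perceptron named in the abstract, the following is a minimal sketch of the general technique, not Stagger's implementation. Everything in it is an assumption made for illustration: the class and function names, the three toy features (a bias, the word form, and a two-letter suffix), the greedy per-token prediction, and the three hand-made training sentences. It uses neither the SALDO lexicon nor Collobert and Weston embeddings.

```python
from collections import defaultdict


class AveragedPerceptron:
    """Toy multi-class averaged perceptron (illustrative, not Stagger's code)."""

    def __init__(self, tags):
        self.tags = sorted(tags)
        self.weights = defaultdict(float)  # (feature, tag) -> current weight
        self._totals = defaultdict(float)  # (feature, tag) -> weight summed over steps
        self._stamps = defaultdict(int)    # (feature, tag) -> step of last change
        self.step = 0                      # number of training examples seen

    def score(self, features, tag):
        return sum(self.weights.get((f, tag), 0.0) for f in features)

    def predict(self, features):
        # Ties are broken in favour of the alphabetically first tag.
        return max(self.tags, key=lambda t: self.score(features, t))

    def _bump(self, key, delta):
        # Credit the old weight for every step it stayed unchanged, then change it.
        self._totals[key] += (self.step - self._stamps[key]) * self.weights[key]
        self._stamps[key] = self.step
        self.weights[key] += delta

    def update(self, features, gold):
        guess = self.predict(features)
        if guess != gold:
            for f in features:
                self._bump((f, gold), 1.0)    # reward the correct tag
                self._bump((f, guess), -1.0)  # penalise the wrong guess
        self.step += 1

    def average(self):
        # Replace each weight by its mean value over all training steps.
        if self.step == 0:
            return
        for key, w in list(self.weights.items()):
            total = self._totals[key] + (self.step - self._stamps[key]) * w
            self.weights[key] = total / self.step


def features(word):
    # Hypothetical feature set chosen only for this sketch.
    return {"bias", "w=" + word.lower(), "suf=" + word[-2:].lower()}


# Hand-made toy data (NN = noun, VB = verb); not taken from any corpus.
TRAIN = [
    [("hunden", "NN"), ("springer", "VB")],
    [("katten", "NN"), ("sover", "VB")],
    [("hunden", "NN"), ("sover", "VB")],
]

model = AveragedPerceptron({"NN", "VB"})
for _ in range(5):  # a few passes over the toy data
    for sentence in TRAIN:
        for word, tag in sentence:
            model.update(features(word), tag)
model.average()

for word in ["hunden", "sover", "bilen"]:
    print(word, model.predict(features(word)))
```

Averaging replaces each final weight with its mean over all training steps, which damps the influence of late, noisy updates. The timestamp bookkeeping in `_bump` avoids summing every weight at every step.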

Citation (APA)

Östling, R. (2013). Stagger: an Open-Source Part of Speech Tagger for Swedish. Northern European Journal of Language Technology, 3, 1–18. https://doi.org/10.3384/nejlt.2000-1533.1331
