Fine-grained POS tagging of german social media and web texts

0Citations
Citations of this article
7Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

This paper presents work on part-of-speech tagging of German social media and web texts. We take a simple Hidden Markov Model based tagger as a starting point, and extend it with a distributional approach to estimating lexical (emission) probabilities of out-of-vocabulary words, which occur frequently in social media and web texts and are a major reason for the low performance of off-the-shelf taggers on these types of text. We evaluate our approach on the recent EmpiriST 2015 shared task dataset and show that our approach improves accuracy on out-of-vocabulary tokens by up to 5.8%; overall, we improve state-of-the-art by 0.4% to 90.9% accuracy.

Cite

CITATION STYLE

APA

Thater, S. (2018). Fine-grained POS tagging of german social media and web texts. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10713 LNAI, pp. 72–80). Springer Verlag. https://doi.org/10.1007/978-3-319-73706-5_7

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free