Part-of-speech Tagging of Code-mixed Social Media Content: Pipeline, Stacking and Joint Modelling

Abstract

Multilingual users of social media sometimes use multiple languages during conversation. Mixing multiple languages in content is known as code-mixing. We annotate a subset of a trilingual code-mixed corpus (Barman et al., 2014) with part-of-speech (POS) tags. We investigate two state-of-the-art POS tagging techniques for code-mixed content and combine the features of the two systems to build a better POS tagger. Furthermore, we investigate the use of a joint model which performs language identification (LID) and part-of-speech (POS) tagging simultaneously.

Cite

Barman, U., Wagner, J., & Foster, J. (2016). Part-of-speech Tagging of Code-mixed Social Media Content: Pipeline, Stacking and Joint Modelling. In EMNLP 2016 - 2nd Workshop on Computational Approaches to Code Switching, CS 2016 - Proceedings of the Workshop (pp. 30–39). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/w16-5804
