Abstract
Multilingual users of social media sometimes use multiple languages within a conversation. Mixing multiple languages in content is known as code-mixing. We annotate a subset of a trilingual code-mixed corpus (Barman et al., 2014) with part-of-speech (POS) tags. We investigate two state-of-the-art POS tagging techniques for code-mixed content and combine the features of the two systems to build a better POS tagger. Furthermore, we investigate the use of a joint model that performs language identification (LID) and POS tagging simultaneously.
Cite
Barman, U., Wagner, J., & Foster, J. (2016). Part-of-speech Tagging of Code-mixed Social Media Content: Pipeline, Stacking and Joint Modelling. In EMNLP 2016 - 2nd Workshop on Computational Approaches to Code Switching, CS 2016 - Proceedings of the Workshop (pp. 30–39). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/w16-5804