Part-of-Speech Tagging for code mixed English-Telugu social media data

Kovida Nelakuditi; Divya Sai Jitta; Radhika Mamidi

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2018) 9623 LNCS 332-342

DOI: 10.1007/978-3-319-75477-2_23

Abstract

Part-of-Speech (POS) tagging is a primary and important step in many Natural Language Processing applications. POS taggers have reported high accuracies on grammatically correct monolingual data. This paper reports work on annotating code-mixed English-Telugu data collected from the social media site Facebook and on creating automatic POS taggers for this corpus. POS tagging is treated as a classification problem, and we use different classifiers such as linear SVMs, CRFs, and Multinomial Bayes with different combinations of features that capture both the context of a word and its internal structure. We also report our experiments on combining monolingual POS taggers to tag this code-mixed English-Telugu data.
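To make the classification framing concrete, the sketch below shows one way to build per-token features that capture both the context of a word and its internal structure. It is an illustration only and not the feature set used in the paper: the function names, the choice of affix lengths, the boundary markers, and the romanized example sentence are all assumptions made here.

```python
# Illustrative sketch only: the feature names, affix lengths, and the example
# sentence are assumptions, not the features reported in the paper.

def token_features(tokens, i, affix_len=3):
    """Build a feature dict for tokens[i] from its context and internal structure."""
    word = tokens[i]
    lower = word.lower()
    feats = {
        # Internal structure of the word itself
        "word": lower,
        "is_capitalized": word[:1].isupper(),
        "has_digit": any(ch.isdigit() for ch in word),
        "is_hashtag_or_mention": word[:1] in ("#", "@"),
        # Context: position in the sentence and neighbouring words
        "is_first": i == 0,
        "is_last": i == len(tokens) - 1,
        "prev_word": tokens[i - 1].lower() if i > 0 else "<s>",
        "next_word": tokens[i + 1].lower() if i < len(tokens) - 1 else "</s>",
    }
    # Character prefixes and suffixes, added only when the word is longer than the affix
    for n in range(1, affix_len + 1):
        if len(lower) > n:
            feats[f"prefix_{n}"] = lower[:n]
            feats[f"suffix_{n}"] = lower[-n:]
    return feats


def sentence_features(tokens):
    """Return one feature dict per token in the sentence."""
    return [token_features(tokens, i) for i in range(len(tokens))]


# A made-up romanized code-mixed sentence, used purely for illustration.
sentence = ["nenu", "movie", "chusanu"]
features = sentence_features(sentence)
```

Feature dicts of this shape could then be vectorized and passed to a classifier of the kind the abstract names, for example scikit-learn's `DictVectorizer` followed by `LinearSVC` or `MultinomialNB`; that pairing is a suggestion here, not a description of the authors' setup.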

Cite (APA)

Nelakuditi, K., Jitta, D. S., & Mamidi, R. (2018). Part-of-Speech Tagging for code mixed English-Telugu social media data. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9623 LNCS, pp. 332–342). Springer Verlag. https://doi.org/10.1007/978-3-319-75477-2_23
