The presented work aims at generating a systematically annotated corpus that can support the enhancement of sentiment analysis tasks in Telugu using word-level sentiment annotations. From OntoSenseNet, we extracted 11,000 adjectives, 253 adverbs, and 8,483 verbs; sentiment annotation is being done by language experts. We discuss the methodology followed for the polarity annotations and validate the developed resource. This work aims at developing a benchmark corpus, as an extension to SentiWordNet, and a baseline accuracy for a model in which lexeme annotations are applied for sentiment predictions. The fundamental aim of this paper is to validate and study the possibility of utilizing machine learning algorithms and word-level sentiment annotations in the task of automated sentiment identification. Furthermore, accuracy is improved by annotating the bi-grams extracted from the target corpus.
CITATION STYLE
Parupalli, S., Rao, V. A., & Mamidi, R. (2018). BCSAT: A benchmark corpus for sentiment analysis in Telugu using word-level annotations. In ACL 2018 - 56th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Student Research Workshop (pp. 99–104). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/p18-3014