Predicting sentiment of Polish language short texts

Abstract

The goal of this paper is to use all available Polish language data sets to seek the best possible performance in supervised sentiment analysis of short texts. We use text collections with labeled sentiment, such as tweets, movie reviews and a sentiment treebank, in three comparison modes. In the first, we examine the performance of models trained and tested on the same text collection using standard cross-validation (in-domain). In the second, we train models on all available data except the given test collection, which we use for testing (one vs rest cross-domain). In the third, we train a model on one data set and apply it to another one (one vs one cross-domain). We compare a wide range of methods, including machine learning on a bag-of-words representation, bidirectional recurrent neural networks, and the most recent pre-trained architectures, ELMo and BERT. We formulate conclusions as to the cross-domain and in-domain performance of each method. Unsurprisingly, BERT turned out to be a strong performer, especially in the cross-domain setting. What is surprising, however, is the solid performance of the relatively simple multinomial Naive Bayes classifier, which performed as well as BERT on several data sets.
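The three comparison modes described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy collections, the whitespace tokenization, the Laplace smoothing value, the fold assignment, and all function names (`train_nb`, `in_domain`, `one_vs_rest`, `one_vs_one`) are assumptions made for illustration. It uses a small hand-written multinomial Naive Bayes over bag-of-words counts so that it runs with the Python standard library only.

```python
import math
from collections import Counter

def train_nb(texts, labels, alpha=1.0):
    """Fit multinomial Naive Bayes on whitespace-tokenized bag-of-words counts."""
    word_counts, vocab = {}, set()
    for text, label in zip(texts, labels):
        tokens = text.lower().split()
        word_counts.setdefault(label, Counter()).update(tokens)
        vocab.update(tokens)
    return {"word_counts": word_counts, "doc_counts": Counter(labels),
            "vocab": vocab, "alpha": alpha}

def predict_nb(model, text):
    """Return the label with the highest log-posterior (ties go to the first label in sorted order)."""
    total_docs = sum(model["doc_counts"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label in sorted(model["doc_counts"]):
        counts = model["word_counts"][label]
        denom = sum(counts.values()) + model["alpha"] * vocab_size
        score = math.log(model["doc_counts"][label] / total_docs)
        for token in text.lower().split():
            if token in model["vocab"]:  # out-of-vocabulary tokens are ignored
                score += math.log((counts[token] + model["alpha"]) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def accuracy(model, texts, labels):
    hits = sum(predict_nb(model, t) == y for t, y in zip(texts, labels))
    return hits / len(texts)

def in_domain(collection, k=2):
    """Mode 1: k-fold cross-validation inside a single collection."""
    pairs = list(zip(*collection))
    scores = []
    for fold in range(k):
        train = [p for i, p in enumerate(pairs) if i % k != fold]
        test = [p for i, p in enumerate(pairs) if i % k == fold]
        scores.append(accuracy(train_nb(*zip(*train)), *zip(*test)))
    return sum(scores) / k

def one_vs_rest(collections, target):
    """Mode 2: train on every collection except `target`, test on `target`."""
    train_texts, train_labels = [], []
    for name, (texts, labels) in collections.items():
        if name != target:
            train_texts += texts
            train_labels += labels
    return accuracy(train_nb(train_texts, train_labels), *collections[target])

def one_vs_one(collections, source, target):
    """Mode 3: train on `source` only, test on `target`."""
    return accuracy(train_nb(*collections[source]), *collections[target])

# Hypothetical toy collections standing in for tweets, reviews and a treebank.
collections = {
    "tweets": (["super dzień", "fatalny dzień", "fatalny mecz", "super mecz"],
               ["pos", "neg", "neg", "pos"]),
    "reviews": (["super film", "fatalny film", "dobry film", "nudny film"],
                ["pos", "neg", "pos", "neg"]),
    "treebank": (["dobry pomysł", "nudny wykład", "super pomysł", "fatalny wykład"],
                 ["pos", "neg", "pos", "neg"]),
}

print("in-domain (tweets):", in_domain(collections["tweets"]))
print("one vs rest (test on reviews):", one_vs_rest(collections, "reviews"))
print("one vs one (tweets -> reviews):", one_vs_one(collections, "tweets", "reviews"))
```

On real data, the same three functions would be applied with each collection taking its turn as the test set, and the Naive Bayes stand-in would be replaced by whichever classifier is being compared.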

Cite

Wawer, A., & Sobiczewska, J. (2019). Predicting sentiment of Polish language short texts. In International Conference Recent Advances in Natural Language Processing, RANLP (Vol. 2019-September, pp. 1321–1327). Incoma Ltd. https://doi.org/10.26615/978-954-452-056-4_151
