Predicting sentiment of Polish language short texts

Aleksander Wawer; Julita Sobiczewska

Conference Proceedings, Open Access

International Conference Recent Advances in Natural Language Processing, RANLP (2019), Vol. 2019-September, pp. 1321-1327

DOI: 10.26615/978-954-452-056-4_151

Abstract

The goal of this paper is to use all available Polish language data sets to seek the best possible performance in supervised sentiment analysis of short texts. We use text collections with labeled sentiment, such as tweets, movie reviews and a sentiment treebank, in three comparison modes. In the first, we examine the performance of models trained and tested on the same text collection using standard cross-validation (in-domain). In the second, we train models on all available data except the given test collection, which we use for testing (one vs rest cross-domain). In the third, we train a model on one data set and apply it to another one (one vs one cross-domain). We compare a wide range of methods, including machine learning on a bag-of-words representation, bidirectional recurrent neural networks, and the most recent pre-trained architectures, ELMo and BERT. We formulate conclusions as to the cross-domain and in-domain performance of each method. Unsurprisingly, BERT turned out to be a strong performer, especially in the cross-domain setting. What is surprising, however, is the solid performance of the relatively simple multinomial Naive Bayes classifier, which performed as well as BERT on several data sets.
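The three comparison modes in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' code: the function names, the unshuffled and unstratified fold scheme, the whitespace tokenizer, the add-one smoothing, and the tiny invented Polish examples are all assumptions made for illustration. A small multinomial Naive Bayes over bag-of-words counts stands in for the classifiers compared in the paper.

```python
import math
from collections import Counter, defaultdict
from itertools import permutations

def in_domain_splits(datasets, k=2):
    """Mode 1: k-fold cross-validation inside each collection (no shuffling; a simplification)."""
    for name, examples in datasets.items():
        for fold in range(k):
            test = examples[fold::k]
            train = [ex for i, ex in enumerate(examples) if i % k != fold]
            yield name, train, test

def one_vs_rest_splits(datasets):
    """Mode 2: train on every collection except the held-out one, test on the held-out one."""
    for held_out, test in datasets.items():
        train = [ex for name, exs in datasets.items() if name != held_out for ex in exs]
        yield held_out, train, test

def one_vs_one_splits(datasets):
    """Mode 3: train on one collection, test on another, for every ordered pair."""
    for source, target in permutations(datasets, 2):
        yield (source, target), datasets[source], datasets[target]

class ToyMultinomialNB:
    """Multinomial Naive Bayes on whitespace-token counts with add-alpha smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, examples):
        self.class_counts = Counter()
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in examples:
            tokens = text.lower().split()
            self.class_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        # Tokens never seen in training are ignored.
        tokens = [t for t in text.lower().split() if t in self.vocab]
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            denom = sum(self.word_counts[label].values()) + self.alpha * len(self.vocab)
            score = math.log(count / total)
            for t in tokens:
                score += math.log((self.word_counts[label][t] + self.alpha) / denom)
            if score > best_score:  # ties keep the first class seen in training
                best_label, best_score = label, score
        return best_label

def accuracy(model, test):
    return sum(model.predict(text) == label for text, label in test) / len(test)

# Invented toy collections (hypothetical, NOT the paper's data).
datasets = {
    "tweets": [("super dzień", "pos"), ("okropny dzień", "neg"),
               ("super mecz", "pos"), ("okropny mecz", "neg")],
    "reviews": [("super film", "pos"), ("okropny film", "neg"),
                ("dobry film", "pos"), ("słaby film", "neg")],
    "treebank": [("dobry pomysł", "pos"), ("słaby pomysł", "neg"),
                 ("super pomysł", "pos"), ("okropny pomysł", "neg")],
}

# Accuracy per held-out collection (mode 2) and per ordered pair (mode 3).
one_vs_rest = {name: accuracy(ToyMultinomialNB().fit(train), test)
               for name, train, test in one_vs_rest_splits(datasets)}
one_vs_one = {pair: accuracy(ToyMultinomialNB().fit(train), test)
              for pair, train, test in one_vs_one_splits(datasets)}
```

With N collections, the one vs rest mode produces N train/test configurations and the one vs one mode produces N(N-1), so cross-domain results are naturally reported per held-out collection and per source/target pair, respectively. A real experiment would shuffle and stratify the in-domain folds and use a proper tokenizer.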

Cite (APA)

Wawer, A., & Sobiczewska, J. (2019). Predicting sentiment of Polish language short texts. In International Conference Recent Advances in Natural Language Processing, RANLP (Vol. 2019-September, pp. 1321–1327). Incoma Ltd. https://doi.org/10.26615/978-954-452-056-4_151
