Semi-supervised Sentiment Analysis of Portuguese Tweets with Random Walk in Feature Sample Networks

Abstract

Nowadays, a huge amount of data is generated daily around the world, and many machine learning tasks require labeled data, which is sometimes not available. Manually labeling such an amount of data may consume a lot of time and resources. One way to overcome this limitation is to learn from both labeled and unlabeled data, which is known as semi-supervised learning. In this paper, we use a positive-unlabeled (PU) learning technique called Random Walk in Feature-Sample Networks (RWFSN) to perform semi-supervised sentiment analysis of Brazilian Portuguese tweets; sentiment analysis is an important machine learning task that can be achieved by classifying the polarity of texts. Although RWFSN reaches excellent performance in many PU learning problems, it has two major limitations when applied to our problem: it assumes that samples are long texts (many features) and that the class prior probabilities are known. We adapt the technique by augmenting the data representation in the feature space and by adding a validation set to better estimate the class priors. As a result, we identified unlabeled samples of the positive class with a precision of around 70% at higher labeled ratios, but with a high standard deviation, showing the impact of data variance on the results. Moreover, given the properties of the RWFSN method, we provide interpretability of the results by pointing out the most relevant features for the task.
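
The general idea of walking over a feature-sample network in a positive-unlabeled setting can be illustrated with a small sketch. This is not the RWFSN algorithm as published: it is a plain random walk with restart on a bipartite graph linking tweets to their tokens, and the function name `rank_unlabeled`, the `restart` and `iters` parameters, and the toy tweets are all assumptions made for illustration.

```python
from collections import defaultdict


def rank_unlabeled(samples, positive_idx, restart=0.3, iters=50):
    """Score samples with a random walk with restart on a bipartite
    feature-sample graph, restarting at the labeled positive samples.
    Illustrative only; not the published RWFSN procedure."""
    # Bipartite edges: sample i <-> every distinct token it contains.
    sample_feats = [set(s) for s in samples]
    feat_samples = defaultdict(list)
    for i, feats in enumerate(sample_feats):
        for f in feats:
            feat_samples[f].append(i)

    n = len(samples)
    # Restart distribution: uniform over the labeled positive samples.
    start = [0.0] * n
    for i in positive_idx:
        start[i] = 1.0 / len(positive_idx)

    p = start[:]
    for _ in range(iters):
        # Step 1: sample -> feature, uniform over the sample's features.
        feat_mass = defaultdict(float)
        for i, feats in enumerate(sample_feats):
            for f in feats:
                feat_mass[f] += p[i] / len(feats)
        # Step 2: feature -> sample, uniform over samples containing it.
        nxt = [0.0] * n
        for f, mass in feat_mass.items():
            for i in feat_samples[f]:
                nxt[i] += mass / len(feat_samples[f])
        # Mix the walk with the restart distribution.
        p = [restart * s + (1 - restart) * q for s, q in zip(start, nxt)]

    labeled = set(positive_idx)
    unlabeled = [i for i in range(n) if i not in labeled]
    return sorted(unlabeled, key=lambda i: p[i], reverse=True), p


# Hypothetical toy data: tweet 0 is the only labeled positive.
tweets = [
    ["adorei", "filme", "otimo"],
    ["filme", "otimo", "demais"],
    ["transito", "horrivel", "hoje"],
    ["adorei", "show"],
]
ranking, scores = rank_unlabeled(tweets, positive_idx=[0])
```

In this toy run, tweet 2 shares no token with the labeled positive, so the walk never reaches it and it ends up last in `ranking`. Short tweets have few tokens and therefore few edges, which is one way to picture why a method that assumes long texts may need an augmented feature representation.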

Cite

Gengo, P., & Verri, F. A. N. (2020). Semi-supervised Sentiment Analysis of Portuguese Tweets with Random Walk in Feature Sample Networks. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12319 LNAI, pp. 595–605). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-61377-8_42
