Development of a Brazilian Portuguese Hotel’s Reviews Corpus

Abstract

The voluntary sharing of textual information over the Internet, particularly through Web 2.0, has created an opportunity to build large linguistic corpora. These corpora can serve as a fundamental resource for developing natural language applications, especially those using deep learning techniques that require large datasets. Sentiment analysis applications are one type that benefits from these resources. This article describes the creation of a corpus intended to support sentiment analysis applications. It consists of reviews of hotels located in the Brazilian capitals and the Federal District, written in Brazilian Portuguese. The reviews that make up the corpus were taken from TripAdvisor and have undergone normalization and POS tagging. The primary goal is to make the corpus available to the community for use in machine learning tasks geared toward natural language.

Cite

de Souza, J. G. R., de Paiva Oliveira, A., & Moreira, A. (2018). Development of a Brazilian Portuguese Hotel’s Reviews Corpus. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11122 LNAI, pp. 353–361). Springer Verlag. https://doi.org/10.1007/978-3-319-99722-3_36
