Characterizing user-generated text content mining: A systematic mapping study of the Portuguese language

Abstract

Unstructured data accounts for more than 80% of enterprise data and is growing exponentially at an annual rate of 60%. Text mining is the process of discovering new, previously unknown, and potentially useful information from a variety of unstructured data, including user-generated text content (UGTC). Given that Portuguese is one of the most widely spoken languages in the world and the second most frequent language on Twitter, the goal of this work is to map the landscape of current studies that apply text mining to UGTC in the Portuguese language. The systematic mapping review method was applied to search for, select, and extract data from the included studies. Our manual and automated searches retrieved 6075 studies published up to the year 2014, of which 35 were included in the study. Text classification accounts for 79% of all text mining tasks, with Naïve Bayes as the main classifier and Twitter as the main data source.

Cite

Souza, E., Castro, D., Vitório, D., Teles, I., Oliveira, A. L. I., & Gusmão, C. (2016). Characterizing user-generated text content mining: A systematic mapping study of the Portuguese language. In Advances in Intelligent Systems and Computing (Vol. 444, pp. 1015–1024). Springer Verlag. https://doi.org/10.1007/978-3-319-31232-3_96
