It was not until 2010 when businesses, politicians and people in general began to realize the potential of Twitter in Spain. This fact has awoken research interest in the extraction of knowledge from Twitter. This paper aims to fill the gap of the lack of resources for Twitter sentiment analysis in Spanish by performing a study of different features and machine learning algorithms for classifying the polarity of Twitter posts. The result is a new corpus of Spanish tweets called COST, and we have carried out a wide-ranging experiment in which different machine learning algorithms have been used. Furthermore, we have tested the influence of using different weighting schemes for unigrams, the influence of eliminating stop-words and the application of a stemmer process.
CITATION STYLE
Martínez-Cámara, E., Martín-Valdivia, M. T., Ureña-López, L. A., & Mitkov, R. (2015). Polarity classification for Spanish tweets using the COST corpus. Journal of Information Science, 41(3), 263–272. https://doi.org/10.1177/0165551514566564
Mendeley helps you to discover research relevant for your work.