It was not until 2010 that businesses, politicians and the general public in Spain began to realise the potential of Twitter. This has awakened research interest in extracting knowledge from Twitter. This paper aims to address the lack of resources for Twitter sentiment analysis in Spanish by studying different features and machine learning algorithms for classifying the polarity of Twitter posts. The result is a new corpus of Spanish tweets called COST, and we have carried out wide-ranging experiments using different machine learning algorithms. Furthermore, we have tested the influence of different weighting schemes for unigrams, of eliminating stop-words and of applying a stemmer.
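To make the idea of unigram weighting schemes concrete, the following is a minimal sketch, not the configuration used in the paper. It assumes three common schemes (binary occurrence, term frequency and TF-IDF), a deliberately tiny hypothetical stop-word list, whitespace tokenisation, and invented example tweets; the function names are illustrative only. Stemming is left out to keep the sketch dependency-free.

```python
import math
from collections import Counter

# Hypothetical, deliberately tiny Spanish stop-word list (an assumption,
# not the list used in the paper).
STOP_WORDS = {"el", "la", "de", "que", "es", "y", "me", "no"}


def tokenize(text, remove_stop_words=True):
    """Lowercase, split on whitespace, optionally drop stop-words."""
    tokens = text.lower().split()
    if remove_stop_words:
        tokens = [t for t in tokens if t not in STOP_WORDS]
    return tokens


def weight_unigrams(docs, scheme="tfidf", remove_stop_words=True):
    """Return one {unigram: weight} dict per document.

    scheme: "binary" (presence), "tf" (relative frequency),
            or "tfidf" (tf * log(N / document frequency)).
    """
    tokenized = [tokenize(d, remove_stop_words) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents containing each unigram.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vec = {}
        for term, count in counts.items():
            if scheme == "binary":
                vec[term] = 1.0
            elif scheme == "tf":
                vec[term] = count / total
            elif scheme == "tfidf":
                vec[term] = (count / total) * math.log(n_docs / doc_freq[term])
            else:
                raise ValueError(f"unknown scheme: {scheme}")
        vectors.append(vec)
    return vectors


# Invented example tweets, for illustration only.
tweets = [
    "me encanta la playa",
    "no me gusta la playa",
    "la comida es buena",
]

for scheme in ("binary", "tf", "tfidf"):
    print(scheme, weight_unigrams(tweets, scheme=scheme)[0])
```

The resulting dictionaries could be passed to any classifier that accepts sparse feature vectors. With stop-word removal switched off, a word that occurs in every tweet (here "la") receives a TF-IDF weight of zero, which illustrates how the weighting scheme and stop-word removal can interact.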