An important sub-task of sentiment analysis is polarity classification, in which text is classified as positive or negative. Supervised machine learning techniques can perform this task very effectively. However, they require a large corpus of labelled training data, and a number of studies have demonstrated that the good performance of supervised models depends on a close match between the training and testing data with respect to domain, topic and time period. Weakly-supervised techniques instead use a large collection of unlabelled text to determine sentiment, so their performance may be less dependent on the domain, topic and time period represented by the testing data. This paper presents experiments that investigate the effectiveness of word similarity techniques for weakly-supervised sentiment classification. It also considers the extent to which the performance of each method is independent of the domain, topic and time period of the testing data. The results indicate that the word similarity techniques are suitable for applications that require sentiment classification across several domains.