Automatic detection of political opinions in Tweets
Abstract
In this paper, we discuss a variety of issues related to opinion mining from microposts, and the challenges they impose on an NLP system, along with an example application we have developed to determine political leanings from a set of pre-election tweets. While there are a number of sentiment analysis tools available which summarise positive, negative and neutral tweets about a given keyword or topic, these tools generally produce poor results, and operate in a fairly simplistic way, using only the presence of certain positive and negative adjectives as indicators, or simple learning techniques which do not work well on short microposts. On the other hand, intelligent tools which work well on movie and customer reviews cannot be used on microposts due to their brevity and lack of context. Our methods make use of a variety of sophisticated NLP techniques in order to extract more meaningful and higher quality opinions, and incorporate extra-linguistic contextual information.
Diana Maynard and Adam Funk
Department of Computer Science, University of Sheffield
Regent Court, 211 Portobello Street, Sheffield, UK
diana@dcs.shef.ac.uk
Key words: NLP, opinion mining, social media analysis
1 Introduction
Social media provides a wealth of information about a user's behaviour and
interests, from the explicit "John's interests are tennis, swimming and classical
music", to the implicit "people who like skydiving tend to be big risk-takers",
to the associative "people who buy Nike products also tend to buy Apple
products". While information about individuals is not always useful on its own,
finding defined clusters of interests and opinions can be interesting. For example,
if many people talk on social media sites about fears in airline security, life
insurance companies might consider opportunities to sell a new service. This kind
of predictive analysis is all about understanding one's potential audience at a
much deeper level, which can lead to improved advertising techniques, such as
personalised advertisements to different groups.
It is in the interests of large public knowledge institutions to be able to collect
and retrieve all the information related to certain events and their development
over time. In this new information age, where thoughts and opinions are shared
through social networks, making the best use of this information requires that
we can distinguish what is important and preserve it, so as to provide a better
understanding and a better snapshot of particular situations.
Online social networks can also trigger a chain of reactions to such situations and
events, which can ultimately lead to administrative, political and societal changes.
In this paper, we discuss a variety of issues related to opinion mining from
microposts, and the challenges they impose on a Natural Language Processing
(NLP) system, along with an example application we have developed to determine
political leanings from a set of pre-election tweets. While knowing that Bob
Smith is a Labour supporter is not particularly interesting on its own, when this
information is combined with other metadata, and information about various
groups of people is combined and analysed, we can begin to get some very useful
insights into political leanings and the factors that influence them, such as
debates aired on television or political incidents that occur.
We first give in Section 2 some examples of previous work on opinion mining
and sentiment analysis, and show why these techniques are either not suitable
for microposts, or do not work particularly well when adapted to other domains
or when generalised. We then describe the opinion mining process in general
(Section 3), the corpus of political tweets we have developed (Section 4), and
the application to analyse opinions (Section 5). Finally, we give details of a
first evaluation of the application and some discussion about future directions
(Sections 6 and 7).
2 Related Work
Sentiment detection has been applied to a variety of different media, typically to
reviews of products or services, though it is not limited to these. Boiy and Moens
[1], for example, see sentiment detection as a classification problem and apply
different feature selections to multilingual collections of digital content including
blog entries, reviews and forum postings. Conclusive measures of bias in such
content have been elusive, but progress towards obtaining reliable measures of
sentiment in text has been made, mapping onto a linear scale related to positive
versus negative, emotional versus neutral language, etc.
Sentiment detection techniques can be roughly divided into lexicon-based
methods [2] and machine-learning methods [1]. Lexicon-based methods rely on
a sentiment lexicon, a collection of known and pre-compiled sentiment terms. A
document's polarity is the ratio of positive to negative terms. Machine learning
approaches make use of syntactic and/or linguistic features, including sentiment
lexicons. Hybrid approaches are very common, and sentiment lexicons play a
key role in the majority of methods. However, such approaches are often
inflexible regarding the ambiguity of sentiment terms. The context in which a
term is used can change its meaning, which is particularly true for adjectives in
sentiment lexicons [3]. Several evaluations have shown that sentiment detection
methods should not neglect contextual information [4, 5], and have identified
context words with a high impact on the polarity of ambiguous terms [6]. Besides
the ambiguity of human language, another bottleneck for sentiment detection
methods is the time-consuming creation of sentiment dictionaries. One solution
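
As a rough illustration of the lexicon-based approach described above, the following minimal Python sketch labels a text by comparing its counts of positive and negative lexicon terms. It is not the method of any cited system: the tiny word lists, the function names (tokenize, count_sentiment_terms, polarity) and the treatment of texts with no negative terms are assumptions made purely for this example.

```python
import re

# Hypothetical toy lexicon, assumed for illustration only; a real system
# would load a much larger pre-compiled sentiment lexicon.
POSITIVE = {"good", "great", "love", "support", "win"}
NEGATIVE = {"bad", "awful", "hate", "fail", "lose"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def count_sentiment_terms(text):
    """Return (positive_count, negative_count) for the lexicon terms found."""
    tokens = tokenize(text)
    pos = sum(1 for t in tokens if t in POSITIVE)
    neg = sum(1 for t in tokens if t in NEGATIVE)
    return pos, neg


def polarity(text):
    """Label a text from the ratio of positive to negative lexicon terms.

    Assumption: a text with no lexicon terms is 'neutral', and a text with
    positive terms but no negative ones is 'positive' (avoids dividing by 0).
    """
    pos, neg = count_sentiment_terms(text)
    if pos == 0 and neg == 0:
        return "neutral"
    if neg == 0:
        return "positive"
    ratio = pos / neg
    if ratio > 1:
        return "positive"
    if ratio < 1:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for tweet in ["I love the new policy, great win",
                  "awful debate, I hate it",
                  "not good"]:
        print(tweet, "->", polarity(tweet))
```

Note that this sketch labels "not good" as positive, because it looks only at individual terms. This is the kind of context-insensitivity that the evaluations cited above warn against.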