Entity Based Sentiment Analysis on Twitter

by Siddharth Batra, Deepak Rao
Science ()


The aim of our work is to use the Twitter corpus to ascertain opinions about entities that matter and to make those opinions easy to consume. We focus on classifying opinions as positive, negative, or neutral. Since there aren't large enough datasets of labeled tweets, limiting the sentiment categories to these three lets us leverage similar but larger datasets for training custom sentiment language models. We begin by extracting entities from the Twitter dataset using the Stanford NER [8]. URLs and username tags (person) are also treated as entities, to augment those found by the NER. To learn a sentiment language model, we use a corpus of 200,000 product reviews that have been labeled as positive or negative. From this corpus, the sentiment language model computes the probability that a given unigram or bigram is being used in a positive context and the probability that it is being used in a negative context. Using this model, we analyze all tweets associated with an entity and classify the overall opinion of that entity as positive or negative, along with the strength of that opinion.
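
The abstract does not give the model's exact formulation, so the following is only a minimal sketch of one way such a unigram/bigram sentiment model could work. Everything in it is an assumption made for illustration: the function names (`ngrams`, `train`, `positive_probability`, `entity_sentiment`), whitespace tokenization, add-one smoothing, and the use of averaged log-odds as the "by how much" score. It is not the authors' implementation.

```python
import math
from collections import Counter


def ngrams(text):
    """Return the lowercase unigrams and bigrams of a text (whitespace tokens)."""
    tokens = text.lower().split()
    bigrams = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    return tokens + bigrams


def train(labeled_reviews):
    """Count n-gram occurrences per label; labels are 'pos' or 'neg'."""
    counts = {"pos": Counter(), "neg": Counter()}
    for text, label in labeled_reviews:
        counts[label].update(ngrams(text))
    return counts


def positive_probability(counts, gram, alpha=1.0):
    """Smoothed estimate that `gram` is used in a positive context.

    Add-alpha smoothing is an assumption; unseen n-grams come out at 0.5.
    """
    pos = counts["pos"][gram] + alpha
    neg = counts["neg"][gram] + alpha
    return pos / (pos + neg)


def entity_sentiment(counts, tweets):
    """Average log-odds over every n-gram in an entity's tweets.

    A score above 0 leans positive, below 0 leans negative, and the
    magnitude is a rough measure of strength.
    """
    total, n = 0.0, 0
    for tweet in tweets:
        for gram in ngrams(tweet):
            p = positive_probability(counts, gram)
            total += math.log(p / (1.0 - p))
            n += 1
    return total / n if n else 0.0


# Toy stand-in for the labeled product-review corpus.
reviews = [
    ("great phone love it", "pos"),
    ("terrible battery hate it", "neg"),
]
model = train(reviews)
score = entity_sentiment(model, ["love this great phone"])
```

Here `score` is positive because the tweet shares n-grams only with the positive review. A neutral class could be approximated by treating scores near zero as neutral, though the threshold would have to be chosen empirically.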
