Abstract
Twitter is an important source of information, but analyzing its data to draw meaningful inferences is challenging. The present paper uses topic modelling and sentiment analysis to extract useful context from a Twitter data set related to the ‘Clean India Mission’. Latent Dirichlet Allocation is used to identify the twenty most trending topics and the top seven terms related to each of them. Model efficiency is reported through coherence and prevalence values. Topic clustering is also used to identify how strongly the topics are related to each other. Five clusters are created from the top trending topics, reflecting different aspects of the corpus, with the average silhouette width used to determine the optimal number of clusters. Lexicon-based classification using the ‘nrc’ sentiment dictionary is also used to reflect people’s sentiment toward the mission at ten different sentiment levels. The Twitter data for the research is collected from seven different hashtags, including the official page of the Clean India campaign. The most relevant subject segments are identified by evaluating the trending topics with the topic coherence value.
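The lexicon-based step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the word list `TOY_LEXICON`, the function `score_tweets`, the sample tweets, and the `#CleanIndia` hashtag are all hypothetical and chosen only for illustration. Only the ten category names follow the NRC scheme (eight emotions plus positive and negative).

```python
import re
from collections import Counter

# The ten NRC categories: eight emotions plus two polarities.
NRC_CATEGORIES = (
    "anger", "anticipation", "disgust", "fear", "joy",
    "sadness", "surprise", "trust", "negative", "positive",
)

# Hypothetical toy lexicon for illustration only; NOT the real NRC word lists.
TOY_LEXICON = {
    "clean": {"joy", "positive", "trust"},
    "proud": {"joy", "positive", "trust"},
    "hope": {"anticipation", "joy", "positive"},
    "dirty": {"disgust", "negative"},
    "garbage": {"disgust", "negative"},
}


def score_tweets(tweets, lexicon=TOY_LEXICON):
    """Count how many lexicon hits fall into each sentiment category."""
    counts = Counter({category: 0 for category in NRC_CATEGORIES})
    for tweet in tweets:
        # Lowercase and keep alphabetic runs only; hashtags lose their '#'.
        for token in re.findall(r"[a-z]+", tweet.lower()):
            for category in lexicon.get(token, ()):
                counts[category] += 1
    return counts


# Hypothetical sample tweets, not taken from the paper's data set.
sample = [
    "Proud to keep my street clean #CleanIndia",
    "So much garbage near the station, dirty!",
]
scores = score_tweets(sample)
for category in NRC_CATEGORIES:
    print(f"{category:>12}: {scores[category]}")
```

Summing hits per category over a whole corpus gives the kind of ten-way sentiment profile the abstract refers to. A real analysis would load the published NRC lexicon instead of the toy dictionary used here.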
Cite
Rani, S., Gill, N. S., & Gulia, P. (2021). Sentiment analysis and topic modelling on Twitter for Clean India Mission. Indian Journal of Computer Science and Engineering, 12(5), 1198–1207. https://doi.org/10.21817/INDJCSE/2021/V12I5/211205018