Efficient pre-processing and feature selection for clustering of cancer tweets

Abstract

The impact of social media on daily life cannot be overlooked. Harnessing this rich and varied data for information is a challenging job for data analysts. Because social media data is unstructured, it has to be processed, represented and then analysed in ways suited to the requirements at hand. Although the retail industry and politicians use social media extensively to gather feedback and market new ideas, its significance in other public-facing fields such as healthcare and security has not been dealt with effectively. Though the information coming from social media may be informal, it contains genuine opinions and experiences that are valuable for improving healthcare services. This work explores the analysis of Twitter data related to cancer, a widely dreaded disease. We collected over one million tweets related to various types of cancer and summarized them into a set of representative tweets, which may give healthcare professionals key inputs regarding symptoms, diagnosis, treatment and recovery. When correlated with clinical research and inputs, this may provide rich information for delivering holistic treatment to patients. We propose additional pre-processing of the raw data. We also explore a combination of feature selection methods, two feature extraction methods and a soft clustering algorithm to study their feasibility for our data. The results support our intuition about the underlying information and show that there is considerable scope for further research in the area.
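The abstract does not name the specific pre-processing steps, feature selection methods or clustering algorithm used, so the following is only a minimal sketch of what such a pipeline might look like, not the authors' method. It assumes regex-based tweet cleaning, a document-frequency filter as a stand-in for feature selection, and fuzzy c-means as a stand-in for the soft clustering algorithm. All function names, the stopword list and the thresholds are hypothetical.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for illustration).
STOPWORDS = {"the", "a", "an", "is", "to", "of", "and", "for", "in", "my", "rt"}

def clean_tweet(text):
    """Lowercase a tweet and strip URLs, @mentions, the '#' sign and non-letters."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop links
    text = re.sub(r"@\w+", " ", text)          # drop user mentions
    text = text.replace("#", " ")              # keep the hashtag word, drop the sign
    text = re.sub(r"[^a-z\s]", " ", text)      # drop digits, punctuation, emoji
    return [t for t in text.split() if t not in STOPWORDS and len(t) > 2]

def select_features(docs, min_df=2):
    """Assumed filter: keep terms occurring in at least min_df documents."""
    df = Counter(term for doc in docs for term in set(doc))
    return sorted(term for term, n in df.items() if n >= min_df)

def vectorize(doc, vocab):
    """Term-count vector of one tokenized tweet over the selected vocabulary."""
    counts = Counter(doc)
    return [float(counts[t]) for t in vocab]

def fuzzy_memberships(point, centers, m=2.0):
    """Fuzzy c-means membership of one point in each cluster (sums to 1)."""
    dists = [sum((p - c) ** 2 for p, c in zip(point, ctr)) ** 0.5 for ctr in centers]
    if any(d == 0.0 for d in dists):
        # The point coincides with a center: split membership among those centers.
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]

def fuzzy_c_means(points, centers, m=2.0, iterations=20):
    """Alternate membership and center updates for a fixed number of iterations."""
    dim = len(points[0])
    for _ in range(iterations):
        u = [fuzzy_memberships(p, centers, m) for p in points]
        updated = []
        for j, old in enumerate(centers):
            weights = [row[j] ** m for row in u]
            total = sum(weights)
            if total == 0.0:
                updated.append(old)  # no point belongs to this cluster: keep it
                continue
            updated.append(
                [sum(w * p[d] for w, p in zip(weights, points)) / total for d in range(dim)]
            )
        centers = updated
    return centers, [fuzzy_memberships(p, centers, m) for p in points]
```

In this sketch each tweet receives a membership degree in every cluster rather than a single label, which is the property that distinguishes soft clustering from hard clustering. A representative tweet for a cluster could then be taken as the one with the highest membership in it, though the abstract does not say how the authors chose theirs.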

Citation

Lavanya, P. G., Kouser, K., & Suresha, M. (2020). Efficient pre-processing and feature selection for clustering of cancer tweets. In Advances in Intelligent Systems and Computing (Vol. 910, pp. 17–37). Springer Verlag. https://doi.org/10.1007/978-981-13-6095-4_2
