In the era of the Fourth Industrial Revolution, data and information have become valuable resources. In this data-driven economy, maintaining a high level of data quality is extremely important. Poor data quality can impose a significant business cost, so data quality and consistency are a primary challenge for contemporary enterprises. There is a need for a concrete understanding of data quality, which is a key concern among data users. Hence, this study was carried out with the objective of analysing Twitter data to extract the sentiments and opinions expressed in unstructured texts and the key topics under consideration by Twitter users. Text classification and topic modelling techniques were applied to identify positive and negative sentiments and the key themes represented in polarized texts referring to data quality. A two-step process was followed to achieve this objective. In the first step, positive and negative sentiments were identified from Twitter feeds. In the second step, Latent Dirichlet Allocation (LDA) was applied; this method discovers keywords in text corpora that capture recurring themes and is widely used to analyse large sets of polarized texts to identify the most common topics quickly and efficiently. The study contributes to the text mining literature by providing a framework for analysing public sentiment. This can help in understanding the key themes in negative sentiments related to data quality among machine learning practitioners. In addition, the key concerns of the public and data users can be highlighted and shared with the larger community.
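The two-step process described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not say which classifier or which LDA inference method the authors used, so the tiny word lists (`POSITIVE`, `NEGATIVE`, `STOPWORDS`), the sample tweets, and the helper names (`tokenize`, `polarity`, `lda_gibbs`) are hypothetical. Step 1 is approximated by counting lexicon hits, and step 2 by a small collapsed Gibbs sampler, which is one standard way to fit LDA.

```python
import random
from collections import Counter

# Hypothetical toy lexicons; not the classifier used in the study.
POSITIVE = {"good", "great", "clean", "reliable", "improved"}
NEGATIVE = {"bad", "poor", "missing", "dirty", "broken", "duplicate"}
STOPWORDS = {"the", "is", "a", "in", "our", "and", "of", "are", "to", "with"}


def tokenize(text):
    """Lowercase, strip simple punctuation, keep alphabetic non-stopwords."""
    words = (w.strip(".,!?#@").lower() for w in text.split())
    return [w for w in words if w.isalpha() and w not in STOPWORDS]


def polarity(tokens):
    """Step 1: label a token list by counting lexicon hits."""
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    return "negative" if score < 0 else "neutral"


def lda_gibbs(docs, n_topics=2, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Step 2: fit LDA by collapsed Gibbs sampling; return per-topic word counts."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]          # topic counts per document
    topic_word = [Counter() for _ in range(n_topics)]   # word counts per topic
    topic_total = [0] * n_topics                        # token counts per topic
    assign = []                                         # topic of every token

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(n_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(zs)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Conditional distribution over topics for this token.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + beta * vocab_size)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return topic_word


# Made-up example tweets, for illustration only.
tweets = [
    "Our data quality is great and the pipeline is reliable",
    "Missing values and duplicate records, poor data quality",
    "Broken schema in the warehouse, bad labels everywhere",
    "Dirty training data with missing labels",
]
tokenized = [tokenize(t) for t in tweets]
labels = [polarity(t) for t in tokenized]

# Topic modelling is run only on the negatively labelled texts.
negative_docs = [t for t, lab in zip(tokenized, labels) if lab == "negative"]
topics = lda_gibbs(negative_docs)
for k, counts in enumerate(topics):
    print(k, [w for w, c in counts.most_common(3) if c > 0])
```

On a corpus this small the topics carry no real meaning; the sketch only shows how the polarity labels from the first step select the texts passed to the topic model in the second.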
CITATION STYLE
Dwivedi, D. N., Wójcik, K., & Vemareddyb, A. (2022). Identification of Key Concerns and Sentiments Towards Data Quality and Data Strategy Challenges Using Sentiment Analysis and Topic Modeling. In Studies in Classification, Data Analysis, and Knowledge Organization (pp. 19–29). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-10190-8_2