A major goal of Exploratory Data Analysis (EDA) is to understand the main characteristics of a dataset, especially the relationships between variables, which help in building predictive models and analysing causality in social science research. This paper introduces the Maximal Information Coefficient (MIC) and its by-product statistics to social science researchers as effective EDA tools for big social data. A case study was conducted using a historical dataset of more than 3,000 country-level indicators. MIC and some of its by-product statistics provided useful information for EDA, complementing the traditional Pearson’s correlation. Moreover, they revealed several significant relationships between variables, including nonlinear ones, which are intriguing and suggest directions for further research in the social sciences.
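To illustrate why a dependence measure beyond Pearson’s correlation can be useful for nonlinear relationships, the following minimal sketch compares Pearson’s r with a simple grid-based mutual information score on synthetic data. The function names `pearson_r` and `binned_mi_score`, the fixed 4-by-4 grid, and the parabola data are all assumptions made for illustration. The grid score is a crude stand-in and is not the MIC algorithm, which searches over many grid resolutions and partitions; neither the data nor the code comes from the paper.

```python
import math
from collections import Counter


def pearson_r(xs, ys):
    """Pearson's linear correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def binned_mi_score(xs, ys, bins=4):
    """Mutual information on ONE fixed equal-width grid, divided by log(bins).

    Hypothetical simplification for illustration only. This is NOT MIC:
    MIC maximises a normalised mutual information over many grids.
    """
    def to_bin(value, lo, hi):
        if hi == lo:
            return 0
        # Clamp the maximum value into the last bin.
        return min(int((value - lo) / (hi - lo) * bins), bins - 1)

    n = len(xs)
    x_lo, x_hi = min(xs), max(xs)
    y_lo, y_hi = min(ys), max(ys)
    cells = [(to_bin(x, x_lo, x_hi), to_bin(y, y_lo, y_hi))
             for x, y in zip(xs, ys)]
    joint = Counter(cells)
    count_x = Counter(i for i, _ in cells)
    count_y = Counter(j for _, j in cells)

    mi = 0.0
    for (i, j), c in joint.items():
        # p(x,y) * log(p(x,y) / (p(x) * p(y))), written with raw counts.
        mi += (c / n) * math.log(c * n / (count_x[i] * count_y[j]))
    return mi / math.log(bins)


# Synthetic, purely nonlinear relationship: a parabola on a symmetric range.
xs = [i / 50 - 1 for i in range(101)]
ys = [x * x for x in xs]

r = pearson_r(xs, ys)
score = binned_mi_score(xs, ys)
print(f"Pearson r = {r:.3f}, grid MI score = {score:.3f}")
```

On this symmetric parabola Pearson’s r is essentially zero even though y is fully determined by x, while the grid score is clearly positive. This is the kind of relationship that a linear measure alone can miss.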
CITATION STYLE
Lertvittayakumjorn, P., Wu, C., Liu, Y., Mi, H., & Guo, Y. (2017). Exploratory analysis of big social data using MIC/MINE statistics. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10540 LNCS, pp. 513–526). Springer Verlag. https://doi.org/10.1007/978-3-319-67256-4_41