Data plays an indispensable role in today's world. Understanding and interpreting data properly lays the foundation for the growth and success of a company or organization. Like business, finance and banking, the health sector produces huge amounts of data. This data needs to be properly analyzed and summarized before it is modeled for a specific purpose. Clinical data generally involves stakeholders such as doctors, technicians, lab analysts, hospital managers, care providers and insurance agents. Exploratory Data Analysis (EDA) plays an important role in providing a complete picture of the dataset and in revealing new insights and hidden patterns in the data. As such, it is the most significant step before preprocessing the data. In this paper we apply EDA to the Statlog heart disease dataset to identify important variables, correlations between variables, missing values and outliers, and to perform principal component analysis (PCA). To verify whether EDA actually affects performance, we use the machine learning algorithms Naïve Bayes, Logistic Regression, Decision Tree, Support Vector Machine and Random Forest. The results indicate that the performance of the prediction model increases considerably after performing EDA, regardless of the prediction algorithm used. In addition, analyzing the dataset with graphical results helps stakeholders make better decisions about their patients and their treatments. Understanding clinical data before modeling helps prevent erroneous models later, and exploratory analysis helps achieve that understanding.
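As a rough illustration of three of the EDA checks named in the abstract (missing values, outliers, and correlations between variables), the following minimal sketch uses only the Python standard library. It is not the authors' implementation: the records, the column names (`age`, `chol`, `max_hr`), the helper functions, and the 1.5 × IQR outlier rule are all assumptions made for this example, and the values are invented rather than taken from the Statlog dataset. PCA and the classifier comparison are not covered here.

```python
import math
import statistics


def missing_counts(rows, columns):
    """Count missing (None) entries per column."""
    return {c: sum(1 for r in rows if r[c] is None) for c in columns}


def iqr_outliers(values):
    """Return values outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR] (an assumed rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
    return [v for v in values if v < low or v > high]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Invented illustrative records, NOT rows from the Statlog heart disease dataset.
rows = [
    {"age": 45, "chol": 210, "max_hr": 170},
    {"age": 52, "chol": 230, "max_hr": 160},
    {"age": 58, "chol": None, "max_hr": 150},
    {"age": 61, "chol": 250, "max_hr": 145},
    {"age": 66, "chol": 260, "max_hr": 135},
    {"age": 49, "chol": 564, "max_hr": 165},
    {"age": 55, "chol": 240, "max_hr": 155},
    {"age": 63, "chol": 255, "max_hr": 140},
]
columns = ["age", "chol", "max_hr"]

# 1. Missing values per column.
missing = missing_counts(rows, columns)

# 2. Outliers in one variable, ignoring missing entries.
chol = [r["chol"] for r in rows if r["chol"] is not None]
outliers = iqr_outliers(chol)

# 3. Correlation between two complete variables.
r_age_hr = pearson([r["age"] for r in rows], [r["max_hr"] for r in rows])

print("missing values:", missing)
print("cholesterol outliers:", outliers)
print("corr(age, max_hr):", round(r_age_hr, 3))
```

In practice, checks like these are usually followed by a decision on how to treat what they reveal (for example, imputing or dropping missing entries), which is the kind of step the abstract credits with improving model performance.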
CITATION STYLE
Mrudula, O., & Sowjanya, A. M. (2020). Understanding Clinical Data using Exploratory Analysis. International Journal of Recent Technology and Engineering (IJRTE), 8(5), 5434–5437. https://doi.org/10.35940/ijrte.e6827.018520