Data exploratory analysis for classification in machine learning algorithms

Jesintha Bala Chandrasekar; Shivakumar Murugesh; Vasudeva Rao Prasadula

Book Chapter

Springer Science and Business Media Deutschland GmbH, (2021), 113-125

DOI: 10.1007/978-981-15-5258-8_13

4 Citations

6 Readers

Abstract

The availability of big data has transformed how machine learning works and how data is used in it. In real-world settings, data gathered from various sources may be unstructured, incomplete, unrealistic, or incorrect. Transforming such data and preparing it for analysis is a challenging task. Because the quality of the data has a direct impact on the efficiency of the trained model, data exploratory analysis (DEA) plays a major role in understanding the data and in forming a quality training dataset for machine learning algorithms. This paper emphasizes the importance of DEA in selecting significant attributes and filling in missing values to form that dataset. The experiments use a binary classification problem, “Survival prediction of Titanic Passengers”. The results show that training the model on the quality dataset improved accuracy compared with training it on raw data.
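The two preparation steps named in the abstract, filling missing values and judging which attributes are significant, can be sketched in a few lines. The abstract does not say which imputation or selection method the authors used, so the median fill and the per-group survival rate below are assumptions chosen for illustration. The rows are invented toy values with Titanic-style column names, not data from the chapter, and the helper names `fill_missing_with_median` and `survival_rate_by` are hypothetical.

```python
from statistics import median

# Toy rows with Titanic-style column names; the values are invented
# for illustration and are NOT taken from the chapter or the real dataset.
rows = [
    {"Sex": "female", "Pclass": 1, "Age": 38.0, "Survived": 1},
    {"Sex": "male",   "Pclass": 3, "Age": None, "Survived": 0},
    {"Sex": "female", "Pclass": 3, "Age": 26.0, "Survived": 1},
    {"Sex": "male",   "Pclass": 1, "Age": 54.0, "Survived": 0},
    {"Sex": "male",   "Pclass": 3, "Age": None, "Survived": 0},
    {"Sex": "female", "Pclass": 2, "Age": 14.0, "Survived": 1},
]


def fill_missing_with_median(rows, column):
    """Return copies of the rows with None in `column` replaced by the
    median of the observed values (an assumed imputation strategy)."""
    observed = [r[column] for r in rows if r[column] is not None]
    fill_value = median(observed)
    return [
        {**r, column: fill_value if r[column] is None else r[column]}
        for r in rows
    ]


def survival_rate_by(rows, column, target="Survived"):
    """Mean of the binary target for each distinct value of `column`.
    A large spread between groups hints that the attribute is informative."""
    groups = {}
    for r in rows:
        groups.setdefault(r[column], []).append(r[target])
    return {value: sum(ys) / len(ys) for value, ys in groups.items()}


clean_rows = fill_missing_with_median(rows, "Age")
rates_by_sex = survival_rate_by(clean_rows, "Sex")
rates_by_class = survival_rate_by(clean_rows, "Pclass")
```

Comparing `rates_by_sex` with `rates_by_class` gives a rough first view of which attribute separates the two classes more strongly. A real analysis would use the full dataset and a proper significance measure before dropping any attribute.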

Citation (APA)

Chandrasekar, J. B., Murugesh, S., & Prasadula, V. R. (2021). Data exploratory analysis for classification in machine learning algorithms. In Lecture Notes on Data Engineering and Communications Technologies (Vol. 53, pp. 113–125). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-981-15-5258-8_13
