Evaluating machine learning classification using sorted missing percentage technique based on missing data

Abstract

Missing data are common in industrial sensor readings owing to system updates and unequal radio-frequency periods. Existing methods that address missing data through imputation may not always be appropriate. This study presented a sorted missing percentage technique for filtering attributes when building machine learning classification models from sensor readings with missing data. Signal detection theory was employed to evaluate the distinguishing ability of the resulting models. To evaluate its performance, the proposed technique was applied to a publicly available air pressure system dataset, which was then used to build several classifiers. The experimental results indicated that the proposed technique allowed a logistic regression model to achieve the best accuracy (99.56%) and a better distinguishing ability (response bias of 0.0013, adjusted response bias of 0.0044, and decision criterion of -1.8994) than the methods applied to the same dataset and reported in papers on binary classification published between 2016 and March 2019, wherein attributes with more than 20% missing data were filtered out. The proposed technique is suitable for industrial sensor data analysis and can be applied to scenarios in which data are missing owing to unequal radio-frequency periods or a system being updated with new fields.
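The two ideas in the abstract, filtering attributes by their sorted missing percentage and scoring a classifier with signal detection theory, can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the toy sensor columns, and the confusion-matrix counts are hypothetical, and the measures use the standard equal-variance definitions (sensitivity d', criterion c, response bias beta), which are assumed here and may differ from the paper's exact formulas. The adjusted response bias is not reproduced because the abstract does not define it.

```python
import math
from statistics import NormalDist


def missing_percentages(columns):
    """Return (name, percent_missing) pairs sorted from least to most missing."""
    pct = {
        name: 100.0 * sum(v is None for v in values) / len(values)
        for name, values in columns.items()
    }
    return sorted(pct.items(), key=lambda item: item[1])


def filter_attributes(columns, max_missing_pct=20.0):
    """Keep attributes whose missing percentage does not exceed the cutoff."""
    return [
        name
        for name, pct in missing_percentages(columns)
        if pct <= max_missing_pct
    ]


def sdt_measures(tp, fn, fp, tn):
    """Standard equal-variance SDT measures from a binary confusion matrix.

    Assumes hit and false-alarm rates are strictly between 0 and 1;
    rates of exactly 0 or 1 would need a correction before the z-transform.
    """
    z = NormalDist().inv_cdf
    hit_rate = tp / (tp + fn)
    false_alarm_rate = fp / (fp + tn)
    z_hit, z_fa = z(hit_rate), z(false_alarm_rate)
    d_prime = z_hit - z_fa               # sensitivity
    criterion = -0.5 * (z_hit + z_fa)    # decision criterion c
    beta = math.exp(d_prime * criterion) # response bias (likelihood ratio)
    return d_prime, criterion, beta


# Hypothetical toy readings; None marks a missing value.
columns = {
    "sensor_a": [1.0, 2.0, None, 4.0, 5.0],     # 20% missing
    "sensor_b": [None, None, None, 1.0, 2.0],   # 60% missing
    "sensor_c": [0.1, 0.2, 0.3, 0.4, 0.5],      # 0% missing
}
kept = filter_attributes(columns, max_missing_pct=20.0)

# Hypothetical confusion-matrix counts for a binary classifier.
d_prime, criterion, beta = sdt_measures(tp=90, fn=10, fp=20, tn=80)
```

With the 20% cutoff mentioned in the abstract, `kept` retains `sensor_c` and `sensor_a` and drops `sensor_b`. A negative criterion together with a beta below 1 indicates a tendency to label cases as positive.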

Hung, C. Y., Jiang, B. C., & Wang, C. C. (2020). Evaluating machine learning classification using sorted missing percentage technique based on missing data. Applied Sciences (Switzerland), 10(14). https://doi.org/10.3390/app10144920
