Dimensionality reduction and the strange case of categorical data for predicting defective water meter devices

Marco Roccetti; Luca Casini; Giovanni Delnevo; Simone Bonfante

Conference Proceedings

Advances in Intelligent Systems and Computing (2021) 1253 AISC 155-159

DOI: 10.1007/978-3-030-55307-4_24

Abstract

Following an experiment with a deep learning (DL) model designed to predict whether a water meter device would fail over time, we came across a very strange case. It arose when we tried to strengthen the training of our classifier by using, in addition to the numerical measurements of consumed water, other available contextual information of categorical type. Surprisingly, this additional categorical information did not improve the prediction accuracy, which instead dropped appreciably. Having recognized the problem as an excessive increase in the dimensionality of the data space under observation, with a corresponding loss of statistical significance, we changed the training strategy. Observing that every categorical variable followed a quasi-Pareto distribution, we re-trained our DL models, one categorical variable at a time, only on the fraction of meter devices (and corresponding measurements of consumed water) that exhibited the most frequent qualitative values for that variable. This new strategy yielded a prediction accuracy never reached before, 87–88% on average.
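The data selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 0.8 coverage threshold, the function names, and the `meter_type` and `readings` fields are all assumptions made for illustration, since the abstract does not say how "most frequent" was defined. The sketch only shows how one might keep the devices whose value for a given categorical variable falls in the head of a quasi-Pareto distribution; training the DL model on the resulting subset is left out.

```python
from collections import Counter


def frequent_values(values, coverage=0.8):
    """Return the most frequent values whose cumulative share reaches `coverage`.

    `coverage` is a hypothetical cut-off, not a figure from the paper.
    """
    counts = Counter(values)
    total = sum(counts.values())
    kept, covered = set(), 0
    # Walk the values from most to least frequent and stop once the
    # requested share of all observations has been covered.
    for value, count in counts.most_common():
        if covered >= coverage * total:
            break
        kept.add(value)
        covered += count
    return kept


def subset_for_variable(records, variable, coverage=0.8):
    """Keep only the records whose `variable` takes one of the frequent values."""
    kept = frequent_values([r[variable] for r in records], coverage)
    return [r for r in records if r[variable] in kept]


# Toy data: one hypothetical categorical field plus numerical readings.
records = (
    [{"meter_type": "A", "readings": [10.0, 12.5]}] * 6
    + [{"meter_type": "B", "readings": [8.0, 9.1]}] * 3
    + [{"meter_type": "C", "readings": [3.0, 2.2]}] * 1
)

# One training subset per categorical variable; here there is only one.
subset = subset_for_variable(records, "meter_type", coverage=0.8)
```

With the toy data above, the rare value `C` is dropped and the nine records with values `A` or `B` remain. In a setting like the one the abstract describes, a separate model would then be trained on each such subset, one per categorical variable.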

Cite

Roccetti, M., Casini, L., Delnevo, G., & Bonfante, S. (2021). Dimensionality reduction and the strange case of categorical data for predicting defective water meter devices. In Advances in Intelligent Systems and Computing (Vol. 1253 AISC, pp. 155–159). Springer. https://doi.org/10.1007/978-3-030-55307-4_24
