Variable transformation for granularity change in hierarchical databases in actual data mining solutions

Abstract

This paper presents a variable transformation strategy for enriching the information content of variables and defining the project target in real-world data mining applications built on relational databases with data at different grains. In an actual solution for assessing school quality from official school survey data and student test data, variables at the student and teacher grains had to become features of the schools they belonged to. The formal problem was how to summarize the relevant information content of the attribute distributions in a few summarizing concepts (features). Instead of the typical lowest-order moments of the distribution, the proposed transformations, based on the distribution histogram, produced a weighted score for the input variables. Following the CRISP-DM method, the problem was precisely defined as a binary decision problem on a granularly transformed student grade. The proposed granular transformation embedded additional human expert knowledge in the input variables at the school level. Logistic regression produced a classification score for good schools, and the AUC_ROC and Max_KS metrics assessed the performance of that score on statistically independent datasets. A 10-fold cross-validation experimental procedure showed that this domain-driven data mining approach produced a statistically significant improvement at the 0.99 confidence level over the usual approach based on the distribution's central tendency.
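
The contrast between a central-tendency summary and a histogram-based weighted score can be illustrated with a minimal Python sketch. The abstract does not give the paper's bin edges, weights, or data, so everything here is an assumption made for illustration: the toy student records, the names `BIN_EDGES` and `BIN_WEIGHTS`, and the helper functions are hypothetical and are not the author's implementation.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical student-grain records: (school_id, grade on a 0-10 scale).
students = [
    ("A", 2.0), ("A", 9.0), ("A", 9.5), ("A", 3.0),
    ("B", 6.0), ("B", 6.0), ("B", 5.5), ("B", 6.0),
]

# Assumed bin edges and expert-chosen weights (illustrative only,
# not taken from the paper). Bins: [0, 4), [4, 7), [7, 10].
BIN_EDGES = [0.0, 4.0, 7.0, 10.0]
BIN_WEIGHTS = [0.0, 0.25, 1.0]


def histogram_proportions(values, edges):
    """Return the share of values falling in each bin; the last bin is closed on the right."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                counts[i] += 1
                break
    return [c / len(values) for c in counts]


def weighted_score(values, edges, weights):
    """Collapse a distribution into one score: sum of weight * bin proportion."""
    props = histogram_proportions(values, edges)
    return sum(w * p for w, p in zip(weights, props))


def school_features(records):
    """Change grain from student to school: one feature row per school."""
    by_school = defaultdict(list)
    for school, grade in records:
        by_school[school].append(grade)
    return {
        school: {
            "mean": mean(grades),
            "hist_score": weighted_score(grades, BIN_EDGES, BIN_WEIGHTS),
        }
        for school, grades in by_school.items()
    }


features = school_features(students)
for school, row in sorted(features.items()):
    print(school, row)
```

In this toy data the two schools have the same mean grade, while the histogram-based score differs because the weights reward the share of students in the top bin. That difference in distribution shape is the kind of information a mean alone cannot carry.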

Adeodato, P. J. L. (2015). Variable transformation for granularity change in hierarchical databases in actual data mining solutions. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9375 LNCS, pp. 146–155). Springer Verlag. https://doi.org/10.1007/978-3-319-24834-9_18
