Bottom-up variable selection in cluster analysis using bootstrapping: A proposal

Hans Joachim Mucha; Hans Georg Bartel

Conference Proceedings

Studies in Classification, Data Analysis, and Knowledge Organization (2016) 125-135

DOI: 10.1007/978-3-319-25226-1_11

Abstract

Variable selection is a problem of increasing interest in many areas of multivariate statistics, such as classification, clustering, and regression. In contrast to supervised classification, variable selection in cluster analysis is much more difficult because usually nothing is known about the true class structure. In addition, variable selection in clustering is closely tied to the central problem of determining the number of clusters K inherent in the data. Here we present a very general bottom-up approach to variable selection in clustering that starts with univariate investigations of stability. The hope is that the structure of interest is contained in only a small subset of the variables. The approach is very general in the sense that we use only non-parametric resampling techniques for validation, looking for clusters that can be reproduced to a high degree under resampling schemes. Our proposed technique can therefore be applied to almost any cluster analysis method.
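
The abstract gives no algorithmic detail, so the following is only a minimal sketch of how a bottom-up, bootstrap-based selection could look, not the authors' procedure. Everything specific in it is an assumption made for illustration: plain k-means as the clustering method, the adjusted Rand index as the measure of reproducibility, a greedy rule that keeps a variable only if it strictly improves stability, a fixed K, and the hypothetical function names `kmeans`, `adjusted_rand`, `stability`, and `bottom_up_select`.

```python
import random
from collections import Counter
from math import comb

def assign(row, centers):
    """Index of the nearest centroid (squared Euclidean distance)."""
    return min(range(len(centers)),
               key=lambda j: sum((a - b) ** 2 for a, b in zip(row, centers[j])))

def kmeans(rows, k, rng, iters=50):
    """Plain Lloyd k-means on a list of tuples; returns the centroids."""
    centers = rng.sample(rows, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for r in rows:
            groups[assign(r, centers)].append(r)
        # An empty group keeps its previous centroid.
        new = [tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[i]
               for i, g in enumerate(groups)]
        if new == centers:
            break
        centers = new
    return centers

def adjusted_rand(a, b):
    """Adjusted Rand index between two label lists of equal length."""
    sum_ij = sum(comb(c, 2) for c in Counter(zip(a, b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(a).values())
    sum_b = sum(comb(c, 2) for c in Counter(b).values())
    expected = sum_a * sum_b / comb(len(a), 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:  # both partitions are trivial
        return 1.0
    return (sum_ij - expected) / (max_index - expected)

def stability(data, variables, k, n_boot=30, seed=0):
    """Mean agreement between the clustering of the full data and
    clusterings fitted on bootstrap samples, using only `variables`."""
    rng = random.Random(seed)
    rows = [tuple(obs[v] for v in variables) for obs in data]
    base = kmeans(rows, k, rng)
    base_labels = [assign(r, base) for r in rows]
    scores = []
    for _ in range(n_boot):
        sample = [rng.choice(rows) for _ in rows]  # draw with replacement
        boot = kmeans(sample, k, rng)
        boot_labels = [assign(r, boot) for r in rows]
        scores.append(adjusted_rand(base_labels, boot_labels))
    return sum(scores) / n_boot

def bottom_up_select(data, k, n_boot=30, seed=0):
    """Rank variables by univariate stability, then greedily add a
    variable only if it strictly improves the stability of the subset."""
    p = len(data[0])
    ranked = sorted(range(p),
                    key=lambda v: stability(data, [v], k, n_boot, seed),
                    reverse=True)
    selected = [ranked[0]]
    best = stability(data, selected, k, n_boot, seed)
    for v in ranked[1:]:
        score = stability(data, selected + [v], k, n_boot, seed)
        if score > best:
            selected.append(v)
            best = score
    return selected, best

# Toy data, made up for illustration: variable 0 separates two groups,
# variables 1 and 2 are uniform noise.
demo_rng = random.Random(1)
demo = [(demo_rng.gauss(0, 1) if i < 20 else demo_rng.gauss(10, 1),
         demo_rng.random(), demo_rng.random()) for i in range(40)]
selected, score = bottom_up_select(demo, k=2)
print(selected, round(score, 3))
```

Because only resampling and a partition-agreement measure are involved, `kmeans` could in principle be swapped for another clustering routine without changing the rest of the sketch, which mirrors the abstract's claim of generality. Choosing K, which the abstract names as a closely related problem, is left out here.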

Cite


APA

Mucha, H. J., & Bartel, H. G. (2016). Bottom-up variable selection in cluster analysis using bootstrapping: A proposal. In Studies in Classification, Data Analysis, and Knowledge Organization (pp. 125–135). Kluwer Academic Publishers. https://doi.org/10.1007/978-3-319-25226-1_11
