This paper presents a new statistical approach to hierarchical cluster analysis of n objects measured on p variables. Motivated by the multivariate analysis of variance model and the method of maximum likelihood, the clustering problem is formulated as a least squares optimization problem that is solved simultaneously for an n-vector of unknown group memberships and a linear clustering function. This formulation is shown to be linked to linear regression analysis and Fisher linear discriminant analysis. It accommodates principal component regression for tackling multicollinearity or rank deficiency, polynomial or B-spline regression for handling non-linearity, and various variable selection methods for eliminating irrelevant variables from the analysis. Algorithmic issues are investigated by using sign eigenanalysis. © 2006 Royal Statistical Society.
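To make the least squares formulation concrete, the following is a minimal sketch of one possible reading of it, not the paper's algorithm. It assumes two groups coded as -1 and +1, centres the data so that a constant membership vector cannot be fitted, and alternates between a regression step for the coefficient vector b and a sign step for the membership vector z. The function name `two_group_ls_clustering`, the principal component starting value, and the toy data are all illustrative assumptions.

```python
import numpy as np


def two_group_ls_clustering(X, max_iter=100):
    """Hypothetical sketch: minimise ||z - Xc b||^2 over z in {-1,+1}^n and b."""
    X = np.asarray(X, dtype=float)
    # Centre the columns so a constant membership vector cannot be fitted.
    Xc = X - X.mean(axis=0)
    # Assumed starting value: signs of the first principal component scores.
    U, _, _ = np.linalg.svd(Xc, full_matrices=False)
    z = np.where(U[:, 0] >= 0, 1.0, -1.0)
    for _ in range(max_iter):
        # Regression step: least squares fit of the current memberships on Xc.
        b, *_ = np.linalg.lstsq(Xc, z, rcond=None)
        # Membership step: for fixed b, sign(Xc b) minimises ||z - Xc b||^2.
        z_new = np.where(Xc @ b >= 0, 1.0, -1.0)
        if np.array_equal(z_new, z):
            break
        z = z_new
    # Refit so that the returned b matches the returned memberships.
    b, *_ = np.linalg.lstsq(Xc, z, rcond=None)
    return z.astype(int), b


# Toy data (an assumption for illustration): two well-separated groups.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=[-5.0, 0.0], scale=0.5, size=(20, 2)),
    rng.normal(loc=[5.0, 0.0], scale=0.5, size=(20, 2)),
])
truth = np.repeat([-1, 1], 20)
labels, coef = two_group_ls_clustering(X)
```

Each step can only lower or maintain the least squares criterion, so the iteration stops after finitely many sign changes, although only at a local solution. The fitted values `Xc @ b` play the role of a linear clustering function whose sign assigns each object to a group, and the group labels are identified only up to a swap of -1 and +1.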
Li, B. (2006). A new approach to cluster analysis: The clustering-function-based method. Journal of the Royal Statistical Society. Series B: Statistical Methodology, 68(3), 457–476. https://doi.org/10.1111/j.1467-9868.2006.00549.x