The number of variables in a regression model is often too large and a more parsimonious model may be preferred. Selection strategies (e.g. all-subset selection with various penalties for model complexity, or stepwise procedures) are widely used, but there are few analytical results about their properties. The problems of replication stability, model complexity, selection bias and an overoptimistic estimate of the predictive value of a model are discussed together with several proposals based on resampling methods. The methods are applied to data from a case-control study on atopic dermatitis and a clinical trial to compare two chemotherapy regimes by using a logistic regression and a Cox model. A recent proposal to use shrinkage factors to reduce the bias of parameter estimates caused by model building is extended to parameterwise shrinkage factors and is discussed as a further possibility to illustrate problems of models which are too complex. The results from the resampling approaches favour greater simplicity of the final regression model.
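As a rough illustration of how resampling can expose the replication stability of a selected model, the sketch below refits a selection procedure on bootstrap samples and records how often each variable is retained. It is not the paper's procedure: it assumes ordinary least squares in place of the logistic and Cox models, a simple backward elimination with a fixed cut-off, and simulated data. The names `backward_eliminate`, `inclusion_frequencies`, `z_crit` and `n_boot` are hypothetical.

```python
import numpy as np

def backward_eliminate(X, y, z_crit=1.96):
    """Backward elimination for OLS (assumed stand-in for the paper's models):
    drop the weakest variable until every remaining |t| is at least z_crit."""
    n = X.shape[0]
    active = list(range(X.shape[1]))
    while active:
        A = np.column_stack([np.ones(n), X[:, active]])  # intercept + active columns
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        sigma2 = resid @ resid / (n - A.shape[1])         # residual variance
        cov = sigma2 * np.linalg.inv(A.T @ A)             # covariance of beta
        t = np.abs(beta[1:] / np.sqrt(np.diag(cov)[1:]))  # skip the intercept
        worst = int(np.argmin(t))
        if t[worst] >= z_crit:
            break
        active.pop(worst)
    return active

def inclusion_frequencies(X, y, n_boot=200, seed=0):
    """Share of bootstrap replications in which each variable is selected."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)                  # resample rows with replacement
        selected = backward_eliminate(X[idx], y[idx])
        counts[np.array(selected, dtype=int)] += 1
    return counts / n_boot

# Simulated data (hypothetical): only the first two of six variables have an effect.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 6))
y = 1.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=200)
freq = inclusion_frequencies(X, y)
print(np.round(freq, 2))
```

Variables with a real effect should be retained in nearly every replication, whereas pure noise variables tend to enter only occasionally. A low inclusion frequency is one possible argument for leaving a variable out of the final model.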
Sauerbrei, W. (1999). The use of resampling methods to simplify regression models in medical statistics. Journal of the Royal Statistical Society. Series C: Applied Statistics, 48(3), 313–329. https://doi.org/10.1111/1467-9876.00155