The problem of fitting logistic regression to a binary model, allowing for misspecification of the response function, is reconsidered. We introduce a two-stage procedure which first orders the predictors with respect to the deviances of the models with the predictor in question omitted, and then chooses the minimizer of the Generalized Information Criterion (GIC) in the resulting nested family of models. In contrast to an exhaustive search, this allows a large number of potential predictors to be considered. We prove that the procedure consistently chooses the model t*, which is the closest in the averaged Kullback-Leibler sense to the true binary model t. We then consider the interplay between t and t* and prove that, for a monotone response function, when there is genuine dependence of the response on the predictors, t* is necessarily nonempty. This implies consistency of a deviance test of significance under misspecification. For a class of distributions of predictors, including the normal family, Ruud's result asserts that t* = t. Numerical experiments reveal that, for normally distributed predictors, the probability of correct selection and the power of the deviance test depend monotonically on Ruud's proportionality constant η.
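The two-stage procedure described above can be sketched in code. The following is a minimal illustration, not the authors' implementation: logistic deviances are computed by a plain Newton/IRLS fit, predictors are ranked by how much the deviance grows when each one is omitted, and a BIC-type penalty log(n) stands in for the GIC penalty (the paper's GIC admits other penalty sequences). All function names are hypothetical.

```python
import numpy as np

def _logistic_deviance(Xd, y, iters=30):
    """Fit logistic regression by Newton/IRLS and return the residual deviance."""
    beta = np.zeros(Xd.shape[1])
    for _ in range(iters):
        mu = 1.0 / (1.0 + np.exp(-(Xd @ beta)))
        W = mu * (1.0 - mu)
        # Newton step; tiny ridge keeps the Hessian invertible
        H = Xd.T @ (Xd * W[:, None]) + 1e-8 * np.eye(Xd.shape[1])
        beta += np.linalg.solve(H, Xd.T @ (y - mu))
    mu = np.clip(1.0 / (1.0 + np.exp(-(Xd @ beta))), 1e-12, 1.0 - 1e-12)
    return -2.0 * np.sum(y * np.log(mu) + (1.0 - y) * np.log(1.0 - mu))

def two_stage_select(X, y, penalty=None):
    """Stage 1: order predictors by the deviance of the model omitting each one.
    Stage 2: minimize GIC(k) = deviance_k + penalty * k over the nested family."""
    n, p = X.shape
    penalty = np.log(n) if penalty is None else penalty  # BIC-type default

    def dev(cols):
        design = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
        return _logistic_deviance(design, y)

    # Stage 1: omitting an informative predictor inflates the deviance most.
    omit_dev = np.array([dev([j for j in range(p) if j != k]) for k in range(p)])
    order = np.argsort(-omit_dev)

    # Stage 2: scan the nested family {}, {x(1)}, {x(1), x(2)}, ...
    gic = [dev(list(order[:k])) + penalty * k for k in range(p + 1)]
    return sorted(int(j) for j in order[:np.argmin(gic)])
```

Only p + (p + 1) model fits are needed instead of the 2^p required by exhaustive search, which is what makes a large number of potential predictors feasible.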
Mielniczuk, J., & Teisseyre, P. (2015). What do we choose when we err? Model selection and testing for misspecified logistic regression revisited. In Challenges in Computational Statistics and Data Mining (Vol. 605, pp. 271–296). Springer International Publishing. https://doi.org/10.1007/978-3-319-18781-5_15