Abstract
Suppose that data are generated according to the model f(y|x; θ) g(x), where y is a response and x is a vector of covariates. We derive and compare semiparametric likelihood and pseudolikelihood methods for estimating θ in situations where the units generated are not fully observed and where it is impossible or undesirable to model the covariate distribution. The probability that a unit is fully observed may depend on y, and there may be a subset of covariates that is observed only for a subsample of individuals. Our key assumptions are that the probability that a unit has missing data depends only on which of a finite number of strata (y, x) belongs to, and that stratum membership is observed for every unit. Applications include case-control studies in epidemiology, field reliability studies, and broad classes of missing data and measurement error problems. Our results make fully efficient estimation of θ feasible, and they generalize and provide insight into a variety of methods that have been proposed for specific problems.
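To make the setting concrete, the following is a minimal simulation sketch and not the authors' method. It assumes a logistic model for f(y|x; θ), two strata defined by y alone (a case-control style design), and known selection probabilities. It fits θ by an inverse-probability-weighted pseudolikelihood, which is only one simple pseudolikelihood approach and is not claimed to be the fully efficient estimator the abstract refers to. The names `THETA`, `SELECT_PROB`, and `fit_weighted_logistic` are hypothetical and chosen for illustration.

```python
import math
import random


def expit(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_weighted_logistic(xs, ys, ws, iters=25):
    """Newton-Raphson for a weighted logistic log-likelihood
    with an intercept b0 and a single slope b1."""
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y, w in zip(xs, ys, ws):
            p = expit(b0 + b1 * x)
            r = w * (y - p)          # weighted score contribution
            v = w * p * (1.0 - p)    # weighted information contribution
            g0 += r
            g1 += r * x
            h00 += v
            h01 += v * x
            h11 += v * x * x
        det = h00 * h11 - h01 * h01
        # Solve the 2x2 system H * delta = g and take a Newton step.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


random.seed(1)
THETA = (-2.0, 1.0)              # assumed true (intercept, slope)
SELECT_PROB = {1: 1.0, 0: 0.2}   # assumed P(fully observed | stratum), stratum = y

xs, ys, ws = [], [], []
for _ in range(20000):
    x = random.gauss(0.0, 1.0)   # covariate distribution g(x), never modelled below
    y = 1 if random.random() < expit(THETA[0] + THETA[1] * x) else 0
    if random.random() < SELECT_PROB[y]:
        # Unit is fully observed; weight it by the inverse selection probability.
        xs.append(x)
        ys.append(y)
        ws.append(1.0 / SELECT_PROB[y])

weighted = fit_weighted_logistic(xs, ys, ws)
naive = fit_weighted_logistic(xs, ys, [1.0] * len(xs))
print("weighted estimate:", weighted)
print("naive estimate:   ", naive)
```

In this sketch the naive fit ignores the response-dependent selection, so its intercept is shifted upward by roughly log(1.0/0.2), while the weighted fit targets the assumed θ without ever specifying g(x).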
Cite
Lawless, J. F., Kalbfleisch, J. D., & Wild, C. J. (1999). Semiparametric methods for response-selective and missing data problems in regression. Journal of the Royal Statistical Society. Series B: Statistical Methodology, 61(2), 413–438. https://doi.org/10.1111/1467-9868.00185