SEMIPARAMETRIC MAXIMUM LIKELIHOOD ESTIMATION WITH TWO-PHASE STRATIFIED CASE-CONTROL SAMPLING

Abstract

We develop statistical inference methods for fitting logistic regression models to data arising from the two-phase stratified case-control sampling design, where a subset of covariates are available only for a portion of cases and controls, who are selected based on the case-control status and fully collected covariates. In addition, we characterize the distribution of incomplete covariates, conditional on fully observed ones. Here, we include all subjects in the analysis in order to achieve consistency in the parameter estimation and optimal statistical efficiency. We develop a semiparametric maximum likelihood approach under the rare disease assumption, where the parameter estimates are obtained using a novel reparametrized profile likelihood technique. We study the large-sample distribution theory for the proposed estimator, and use simulations to demonstrate that it performs well in finite samples and improves on the statistical efficiency of existing approaches. We apply the proposed method to analyze a stratified case-control study of breast cancer nested within the Breast Cancer Detection and Demonstration Project, where one breast cancer risk predictor, namely, percent mammographic density, was measured only for a subset of the women in the study.
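The sampling design described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' estimator or code. It only generates data with the structure the abstract describes: a phase-1 case-control sample in which disease status and an inexpensive covariate are recorded for everyone, and a phase-2 subsample, selected using only those fully observed quantities, in which the expensive covariate is measured. Every name and numerical value here (`simulate_two_phase`, the coefficients, the selection probabilities, the sample sizes) is a hypothetical choice made for illustration.

```python
import math
import random


def simulate_two_phase(n_pop=20000, n_phase1_per_group=500,
                       phase2_probs=None, seed=0):
    """Simulate a two-phase stratified case-control sample (illustrative only)."""
    rng = random.Random(seed)
    if phase2_probs is None:
        # Hypothetical phase-2 selection probabilities, indexed by
        # (case status d, stratum z). They depend only on fully observed data.
        phase2_probs = {(1, 0): 0.8, (1, 1): 0.8, (0, 0): 0.3, (0, 1): 0.6}

    # Population: z is an inexpensive binary covariate, x an expensive
    # continuous covariate whose distribution depends on z.
    population = []
    for _ in range(n_pop):
        z = 1 if rng.random() < 0.4 else 0
        x = rng.gauss(0.5 * z, 1.0)
        # Logistic disease model; the low intercept mimics a rare disease.
        lin = -4.0 + 0.7 * x + 0.5 * z
        d = 1 if rng.random() < 1.0 / (1.0 + math.exp(-lin)) else 0
        population.append({"d": d, "z": z, "x": x})

    # Phase 1: case-control sample; d and z are recorded for every subject.
    cases = [p for p in population if p["d"] == 1]
    controls = [p for p in population if p["d"] == 0]
    phase1 = (rng.sample(cases, min(n_phase1_per_group, len(cases)))
              + rng.sample(controls, min(n_phase1_per_group, len(controls))))

    # Phase 2: x is measured only for a subsample chosen using (d, z) alone.
    sample = []
    for p in phase1:
        selected = rng.random() < phase2_probs[(p["d"], p["z"])]
        sample.append({"d": p["d"], "z": p["z"],
                       "x": p["x"] if selected else None,
                       "phase2": selected})
    return sample


data = simulate_two_phase()
n_complete = sum(r["phase2"] for r in data)
print(len(data), "phase-1 subjects;", n_complete, "with x measured")
```

An analysis restricted to the records with `phase2` set to `True` would discard the phase-1 subjects whose `x` is `None`. The approach summarized in the abstract instead uses all subjects, together with a model for the distribution of the incomplete covariate given the fully observed ones.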

Cite

Cao, Y., Chen, L., Yang, Y., & Chen, J. (2023). SEMIPARAMETRIC MAXIMUM LIKELIHOOD ESTIMATION WITH TWO-PHASE STRATIFIED CASE-CONTROL SAMPLING. Statistica Sinica, 33(3), 2233–2256. https://doi.org/10.5705/ss.202021.0214
