Statistical Inference for High-Dimensional Linear Regression with Blockwise Missing Data

  • Xue F
  • Ma R
  • Li H

Abstract

Blockwise missing data occurs frequently when we integrate multisource or multimodality data where different sources or modalities contain complementary information. In this paper, we consider a high-dimensional linear regression model with blockwise missing covariates and a partially observed response variable. Under this semi-supervised framework, we propose a computationally efficient estimator for the regression coefficient vector based on carefully constructed unbiased estimating equations and a multiple blockwise imputation procedure, and obtain its rates of convergence. Furthermore, building upon an innovative semi-supervised projected estimating equation technique that intrinsically achieves bias-correction of the initial estimator, we propose nearly unbiased estimators for the individual regression coefficients that are asymptotically normally distributed under mild conditions. By carefully analyzing these debiased estimators, asymptotically valid confidence intervals and statistical tests about each regression coefficient are constructed. Numerical studies and application analysis of the Alzheimer's Disease Neuroimaging Initiative data show that the proposed method performs better and benefits more from unsupervised samples than existing methods.
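To make the data setting concrete, the following is a minimal sketch of what blockwise missing covariates with a partially observed response might look like. It is not the authors' estimator or imputation procedure. The function names (`make_blockwise_missing`, `missing_pattern`), the three sources, their sizes, the missingness patterns, and the 50% labeling rate are all hypothetical choices made for illustration only.

```python
import numpy as np

def make_blockwise_missing(n_per_group=50, block_sizes=(3, 4, 2), seed=0):
    """Simulate covariates from several sources with blockwise missingness.

    All settings here are illustrative assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    p = sum(block_sizes)

    # Column indices belonging to each source (block).
    edges = np.cumsum((0,) + tuple(block_sizes))
    blocks = [np.arange(edges[k], edges[k + 1]) for k in range(len(block_sizes))]

    # Hypothetical patterns: the sources each group of samples observes.
    patterns = [(0, 1, 2), (0, 1), (0, 2)]
    n = n_per_group * len(patterns)
    group = np.repeat(np.arange(len(patterns)), n_per_group)

    # Linear model y = X beta + noise with a sparse coefficient vector.
    X = rng.normal(size=(n, p))
    beta = np.zeros(p)
    beta[:2] = 1.0
    y = X @ beta + rng.normal(size=n)

    # Blank out whole blocks for the groups that do not observe them.
    for g, observed in enumerate(patterns):
        for k, cols in enumerate(blocks):
            if k not in observed:
                X[np.ix_(group == g, cols)] = np.nan

    # Partially observed response: hide y for roughly half of the samples.
    labeled = rng.random(n) < 0.5
    y = np.where(labeled, y, np.nan)
    return X, y, group, blocks

def missing_pattern(X):
    """Return the distinct row-wise missingness patterns and their counts."""
    mask = np.isnan(X)
    patterns, counts = np.unique(mask, axis=0, return_counts=True)
    return patterns, counts

X, y, group, blocks = make_blockwise_missing()
patterns, counts = missing_pattern(X)
print(patterns.astype(int))
print(counts)
```

Each row of `patterns` marks which covariates are missing for one group of samples. Entire blocks of columns are missing together, which is what distinguishes this structure from entrywise missingness, and the samples with `y` set to NaN play the role of the unsupervised samples mentioned in the abstract.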

Cite

Xue, F., Ma, R., & Li, H. (2025). Statistical Inference for High-Dimensional Linear Regression with Blockwise Missing Data. Statistica Sinica. https://doi.org/10.5705/ss.202022.0104
