With the availability of massive data sets, making accurate inference at low computational cost is key to improving scalability. When both the sample size and the dimensionality are large, naively applying the de-biasing idea to derive confidence intervals can be computationally inefficient or infeasible, as the de-biasing procedure increases the computational cost by an order of magnitude compared with the initial penalized estimation. Therefore, we suggest a split-and-conquer approach to improve the scalability of the de-biasing procedure and show that the resulting confidence interval is asymptotically of the same length as the one obtained using all of the data at once. Moreover, the largest allowable split size is substantially increased by separating the initial estimation and the relaxed projection steps, revealing that the sample sizes these two steps require for statistical guarantees are different. Finally, a refined inference procedure is proposed to address the inflation issue in finite-sample performance when the split size becomes large. Both the computational advantages and the theoretical guarantees of our new methodology are demonstrated through numerical studies.
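To make the split-and-conquer idea concrete, below is a minimal Python sketch of a generic scheme of this kind: the data are partitioned, a de-biased lasso estimate of a single coefficient is computed on each split (initial lasso fit plus a relaxed-projection correction via a nodewise lasso), and the split estimates are averaged to form a confidence interval. This is an illustration under generic assumptions, not the authors' exact procedure; the function names (`debias_on_split`, `split_and_conquer_ci`), tuning-parameter values, and the choice of splitting are placeholders.

```python
# Illustrative sketch of a split-and-conquer de-biased lasso confidence
# interval for one coefficient beta_j in y = X beta + noise.
# All names and tuning values here are hypothetical placeholders.
import numpy as np
from sklearn.linear_model import Lasso
from scipy.stats import norm

def debias_on_split(X, y, j, lam_beta=0.1, lam_w=0.1):
    """De-biased lasso estimate of beta_j and a variance proxy on one split."""
    n, p = X.shape
    beta_hat = Lasso(alpha=lam_beta, fit_intercept=False).fit(X, y).coef_
    # Relaxed projection: nodewise lasso of column j on the remaining columns,
    # giving a direction z approximately orthogonal to the other predictors.
    others = np.delete(np.arange(p), j)
    gamma = Lasso(alpha=lam_w, fit_intercept=False).fit(X[:, others], X[:, j]).coef_
    z = X[:, j] - X[:, others] @ gamma
    denom = z @ X[:, j]
    # One-step bias correction of the initial penalized estimate.
    b_debiased = beta_hat[j] + z @ (y - X @ beta_hat) / denom
    resid = y - X @ beta_hat
    sigma2_hat = resid @ resid / n
    var_hat = sigma2_hat * (z @ z) / denom**2
    return b_debiased, var_hat

def split_and_conquer_ci(X, y, j, K=5, level=0.95, seed=0):
    """Average de-biased estimates over K disjoint splits and form a CI."""
    n = X.shape[0]
    idx = np.random.default_rng(seed).permutation(n)
    ests, variances = [], []
    for part in np.array_split(idx, K):
        b, v = debias_on_split(X[part], y[part], j)
        ests.append(b)
        variances.append(v)
    b_bar = np.mean(ests)
    # Variance of the average of (approximately) independent split estimates.
    se = np.sqrt(np.sum(variances)) / K
    z_crit = norm.ppf(0.5 + level / 2)
    return b_bar - z_crit * se, b_bar + z_crit * se

# Example usage on simulated data:
if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n, p = 2000, 100
    X = rng.standard_normal((n, p))
    beta = np.zeros(p)
    beta[0] = 1.5
    y = X @ beta + rng.standard_normal(n)
    print(split_and_conquer_ci(X, y, j=0, K=4))
```

The per-split work is embarrassingly parallel, which is the source of the computational savings; the paper's contribution concerns how large K can be while the averaged interval retains the same asymptotic length as the full-data interval.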
Zheng, Z., Zhang, J., Li, Y., & Wu, Y. (2020). Partitioned approach for high-dimensional confidence intervals with large split sizes. Statistica Sinica, 30(1). https://doi.org/10.5705/SS.202018.0379