On the use of the observation-wise k-fold operation in PCA cross-validation

  • Edoardo Saccenti
  • José Camacho

Cross-validation (CV) is a common approach for determining the optimal number of components in a principal component analysis (PCA) model. To guarantee independence between model testing and calibration, the observation-wise k-fold operation is commonly implemented in each cross-validation step. This operation makes the CV algorithm computationally intensive and is the main obstacle to applying CV to very large data sets. In this paper, we carry out an empirical and theoretical investigation of the use of this operation in the element-wise k-fold (ekf) algorithm, the state-of-the-art CV algorithm. We show that when very large data sets need to be cross-validated and computational time is a concern, the observation-wise k-fold operation can be skipped. The theoretical properties of the resulting modified algorithm, referred to as the column-wise k-fold (ckf) algorithm, are derived, and its performance is evaluated on several artificial and real data sets. We suggest that the ckf algorithm is a valid alternative to the standard ekf for reducing the computational time needed to cross-validate a data set.
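The difference between the two schemes can be illustrated with a minimal Python sketch. It is not the authors' implementation. It rests on several assumptions that the abstract does not specify: left-out columns are estimated by zero-filling them in the centered data and projecting onto the loadings, the prediction error is summarised as a PRESS value per number of components, and the function names (`cv_press`, `_loadings`, `_column_press`) and parameters are hypothetical. Setting `n_row_folds=None` skips the observation-wise split, so the PCA model is fitted once on all rows (ckf-style). An integer value refits the model once per row fold (ekf-style), which is where the extra computational cost comes from.

```python
import numpy as np


def _loadings(X_centered, n_components):
    # Right singular vectors of the centered data are the PCA loadings.
    _, _, vt = np.linalg.svd(X_centered, full_matrices=False)
    return vt[:n_components].T  # shape: (n_variables, n_components)


def _column_press(P, X_test, col_groups):
    # Assumed imputation: zero each group of columns in turn, project onto
    # the loadings, and score only the reconstruction of the zeroed columns.
    press = 0.0
    for cols in col_groups:
        X_masked = X_test.copy()
        X_masked[:, cols] = 0.0
        X_hat = (X_masked @ P) @ P.T
        press += np.sum((X_test[:, cols] - X_hat[:, cols]) ** 2)
    return press


def cv_press(X, max_components, n_col_folds, n_row_folds=None, seed=0):
    """Return PRESS for 0..max_components components.

    n_row_folds=None skips the observation-wise split (ckf-style);
    an integer >= 2 keeps it (ekf-style).
    """
    X = np.asarray(X, dtype=float)
    n, m = X.shape
    rng = np.random.default_rng(seed)
    col_groups = np.array_split(rng.permutation(m), n_col_folds)

    if n_row_folds is None:
        # One "split": calibrate and test on all observations.
        splits = [(np.arange(n), np.arange(n))]
    else:
        folds = np.array_split(rng.permutation(n), n_row_folds)
        splits = [(np.setdiff1d(np.arange(n), f), f) for f in folds]

    press = np.zeros(max_components + 1)
    for train, test in splits:
        mean = X[train].mean(axis=0)          # center with calibration mean
        P_full = _loadings(X[train] - mean, max_components)
        X_test = X[test] - mean
        for a in range(max_components + 1):   # a = number of components
            press[a] += _column_press(P_full[:, :a], X_test, col_groups)
    return press


# Simulated data: 3 latent components plus noise (illustrative only).
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3)) @ rng.normal(size=(3, 10))
X += 0.1 * rng.normal(size=(200, 10))

press_ckf = cv_press(X, max_components=6, n_col_folds=10)
press_ekf = cv_press(X, max_components=6, n_col_folds=10, n_row_folds=5)
print("ckf minimum at", np.argmin(press_ckf), "components")
print("ekf minimum at", np.argmin(press_ekf), "components")
```

In this sketch the ckf-style call performs one singular value decomposition, while the ekf-style call performs one per row fold. The number of components would be read off as the position of the PRESS minimum in either case.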

Author-supplied keywords

  • Cross-validation
  • Dimensionality assessment
  • Principal component analysis
