In the environmental sciences, one often encounters large datasets with many variables. For instance, one may have a dataset of monthly sea surface temperature (SST) anomalies (anomalies are the departures from the mean) collected at l = 1,000 grid locations over several decades, i.e. the data are of the form x = [x_1, ⋯, x_l], where each variable x_i (i = 1, ⋯, l) has n samples. The samples may be collected at times t_k (k = 1, ⋯, n), so each x_i is a time series containing n observations. Since the SSTs at neighboring grid points are correlated, and a dataset with 1,000 variables is quite unwieldy, one looks for ways to condense the large dataset to only a few principal variables. The most common approach is principal component analysis (PCA), also known as empirical orthogonal function (EOF) analysis (Jolliffe 2002). In this chapter, we examine the use of MLP NN models for nonlinear PCA (NLPCA) in Section 8.2, the overfitting problem associated with NLPCA in Section 8.3, and the extension of NLPCA to closed curve solutions in Section 8.4. MATLAB code for NLPCA is downloadable from http://www.ocgy.ubc.ca/projects/clim.pred/download.html. The discrete approach by self-organizing maps is presented in Section 8.5, and the generalization of NLPCA to complex variables in Section 8.6. © 2009 Springer Netherlands.
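As a rough illustration of the linear PCA (EOF) step that the abstract takes as its starting point, the following Python sketch condenses a synthetic dataset with l variables and n samples into a few principal components. It is not the chapter's NLPCA method and is not the MATLAB code referred to above; the function name `pca`, the synthetic data, and the sizes n = 240 and l = 50 are assumptions chosen only for illustration.

```python
import numpy as np

def pca(x, n_modes):
    """Linear PCA (EOF analysis) of x with shape (n samples, l variables)."""
    # Anomalies: remove the time mean of each variable (column).
    anomalies = x - x.mean(axis=0)
    # SVD of the anomaly matrix; rows of vt are the orthonormal spatial patterns.
    u, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    eofs = vt[:n_modes]                    # shape (n_modes, l)
    pcs = u[:, :n_modes] * s[:n_modes]     # principal component time series, shape (n, n_modes)
    explained = s[:n_modes] ** 2 / np.sum(s ** 2)  # fraction of variance per mode
    return pcs, eofs, explained

# Hypothetical synthetic data: one oscillating spatial pattern plus weak noise.
rng = np.random.default_rng(0)
n, l = 240, 50                             # assumed sizes, much smaller than l = 1,000
t = np.arange(n)
pattern = np.sin(np.linspace(0.0, np.pi, l))
signal = np.outer(np.sin(2.0 * np.pi * t / 48.0), pattern)
x = signal + 0.1 * rng.standard_normal((n, l))

pcs, eofs, explained = pca(x, n_modes=3)
print("variance fraction of leading modes:", explained)
```

Because the synthetic field is built from a single spatial pattern, the leading mode should carry most of the variance, which is the sense in which a dataset with many correlated variables can be condensed to a few principal variables.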
Hsieh, W. W. (2009). Nonlinear principal component analysis. In Artificial Intelligence Methods in the Environmental Sciences (pp. 173–190). Springer Netherlands. https://doi.org/10.1007/978-1-4020-9119-3_8