Feynman-Hellmann theorem and signal identification from sample covariance matrices

Abstract

A common method for extracting true correlations from large data sets is to look for variables with unusually large coefficients on the principal components with the biggest eigenvalues. Here, we show that even if the top principal components have no unusually large coefficients, large coefficients on lower principal components can still correspond to a valid signal. This contradicts the typical mathematical justification for principal component analysis, which requires that eigenvalue distributions from relevant random matrix ensembles have compact support, so that any eigenvalue above the upper threshold corresponds to signal. The new possibility arises via a mechanism based on a variant of the Feynman-Hellmann theorem, and leads to significant correlations between a signal and principal components when the underlying noise is not both independent and uncorrelated, so that the eigenvalue spacing of the noise distribution can be sufficiently large. This mechanism justifies a new way of using principal component analysis and rationalizes recent empirical findings that lower principal components can carry information about the signal, even if the largest ones do not.
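The Feynman-Hellmann mechanism invoked above can be illustrated numerically: for a covariance matrix perturbed by a rank-one signal, the theorem predicts that each eigenvalue shifts, to first order, by the perturbation strength times the squared overlap between the signal direction and that eigenvector. The sketch below is not the authors' code; it is a minimal numpy check of that first-order prediction, with arbitrary illustrative choices of dimension, spectrum, and perturbation strength.

```python
import numpy as np

rng = np.random.default_rng(0)

# Population "noise" covariance with well-separated eigenvalues
# (i.e., large eigenvalue spacing, as in the anisotropic-noise setting).
p = 6
lams = np.array([10.0, 8.0, 6.0, 4.0, 2.0, 1.0])
Q, _ = np.linalg.qr(rng.standard_normal((p, p)))  # random orthogonal basis
C = Q @ np.diag(lams) @ Q.T

# Rank-one "signal" perturbation of strength eps along a random unit vector v.
eps = 1e-3
v = rng.standard_normal(p)
v /= np.linalg.norm(v)
C_pert = C + eps * np.outer(v, v)

# Eigendecompositions (np.linalg.eigh returns eigenvalues in ascending order).
w0, U0 = np.linalg.eigh(C)
w1, _ = np.linalg.eigh(C_pert)

# Feynman-Hellmann variant: d(lambda_i)/d(eps) = u_i^T (v v^T) u_i = (u_i . v)^2,
# so each eigenvalue shifts by eps * (overlap of its eigenvector with the signal)^2.
predicted = w0 + eps * (U0.T @ v) ** 2

# Residual against the true perturbed spectrum is second order in eps.
print(np.max(np.abs(w1 - predicted)))
```

Because the eigenvalue gaps here are of order one and eps is small, the first-order prediction matches the exact perturbed eigenvalues to O(eps^2); every eigenvalue rises by an amount proportional to its eigenvector's overlap with the signal, which is the sense in which a lower principal component can pick up a signal aligned with it.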


Citation (APA)
Colwell, L. J., Qin, Y., Huntley, M., Manta, A., & Brenner, M. P. (2014). Feynman-Hellmann theorem and signal identification from sample covariance matrices. Physical Review X, 4(3). https://doi.org/10.1103/PhysRevX.4.031032
